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Chapter  0 


Accomplishments 


Estimating  the  disturbance  or  clutter  covariance  is  a  centrally  important  problem  in  radar  space  time 
adaptive  processing  (STAP)  since  estimation  of  the  disturbance  or  interference  covariance  matrix  plays 
a  central  role  on  radar  target  detection  in  the  presence  of  clutter,  noise  and  a  jammer.  The  distur¬ 
bance  covariance  matrix  should  be  inferred  from  training  sample  observations  in  practice.  Traditional 
maximum  likelihood  (ML)  estimators  are  effective  when  homogeneous  (target  free)  training  data  is 
abundant  but  lead  to  poor  estimates,  degraded  false  alarm  rates,  and  detection  loss  in  the  regime  of 
limited  training.  However,  large  number  of  homogeneous  training  samples  are  generally  not  available 
because  of  difficulty  of  guaranteeing  target  free  disturbance  observation,  practical  limitations  imposed 
by  the  spatio-temporal  nonstationarity,  and  system  considerations.  The  problem  has  been  exacerbated 
by  recent  advances  that  have  led  to  more  antenna  elements  (J)  and  higher  temporal  resolution  (P) 
time  epochs  resulting  in  a  large  dimension  (N  =  JP). 

In  this  report,  we  look  to  address  the  aforementioned  challenges  by  exploiting  physically  inspired 
constraints  into  ML  estimation.  While  adding  constraints  is  beneficial  to  achieve  satisfactory  perfor¬ 
mance  in  the  practical  regime  of  limited  training,  it  leads  to  a  challenging  problem.  Unlike  uncon¬ 
strained  estimators,  a  vast  majority  of  constrained  radar  STAP  estimators  are  iterative  and  expensive 
numerically,  which  prohibits  practical  deployment.  We  focus  on  breaking  this  classical  trade-off  be¬ 
tween  computational  tractability  and  desirable  performance  measures,  particularly  in  training  starved 
regimes.  In  particular,  we  exploit  both  the  structure  of  the  disturbance  covariance  and  importantly 
the  knowledge  of  the  clutter  rank  to  yield  a  new  rank  constrained  maximum  likelihood  (RCML) 
estimator  of  clutter/disturbance  covariance.  We  demonstrate  that  the  rank-constrained  estimation 
problem  can  in  fact  be  cast  in  the  framework  of  a  tractable  convex  optimization  problem,  and  derive 
closed  form  expressions  for  the  estimated  covariance  matrix.  In  addition,  we  derive  a  new  covariance 
estimator  for  STAP  that  jointly  considers  a  Toeplitz  structure  and  a  rank  constraint  on  the  clutter 
component.  Past  work  has  shown  that  in  the  regime  of  low  training,  even  handling  each  constraint 
individually  is  hard  and  techniques  often  resort  to  slow  numerically  based  solutions.  Our  proposed 
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solution  leverages  the  rank  constrained  ML  estimator  (RCML)  of  structured  covariances  to  build  a 
computationally  friendly  approximation  that  involves  a  cascade  of  two  closed  form  solutions.  Per¬ 
formance  analysis  using  the  KASSPER  data  set  (where  ground  truth  covariance  is  made  available) 
shows  that  the  proposed  RCML  estimator  vastly  outperforms  state-of-the  art  alternatives  even  for 
low  training  including  the  notoriously  difficult  regime  oi  K  <  N  training  regimes  and  for  the  exper¬ 
iments  considering  real-world  scenarios  such  as  target  detection  performance  and  the  case  that  some 
of  training  samples  are  corrupted  by  target  information. 

Finally,  we  address  the  problem  of  working  with  inexact  physical  radar  parameters  under  a  practical 
radar  environment.  As  shown  in  this  report,  employing  practical  constraints  such  as  a  rank  of  the 
clutter  subspace  and  a  condition  number  of  disturbance  covariance  leads  to  a  practically  powerful 
estimator  as  well  as  a  closed  form  solution.  While  the  rank  and  the  condition  number  are  very  effective 
constraints,  often  practical  non-ideality  makes  it  difficult  to  be  known  precisely  using  physical  models. 
We  propose  a  robust  covariance  estimation  method  via  an  expected  likelihood  (EL)  approach.  We 
analyze  covariance  estimation  algorithms  under  three  different  cases  of  imperfect  constraints:  1)  only 
rank  constraint,  2)  both  rank  and  noise  power  constraint,  and  3)  condition  number  constraint.  For 
each  case,  we  formulate  estimation  of  the  constraint  as  an  optimization  problem  with  the  expected 
likelihood  criterion  and  formally  derive  and  prove  a  significant  analytical  result  such  as  uniqueness  of 
the  solution.  Through  experimental  results  from  a  simulation  model  and  the  KASSPER  data  set,  we 
show  the  estimator  with  optimal  constraints  obtained  by  the  EL  approach  outperforms  alternatives 
in  the  sense  of  a  normalized  signal-to-interference  and  noise  ratio  (SINR). 
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Chapter  1 


Rank  Constrained  ML  Estimation 
of  Structured  Covariance  Matrices 

1.1  Summary 

Estimating  the  disturbance  or  clutter  covariance  is  a  centrally  important  problem  in  radar  space  time 
adaptive  processing  (STAP).  Traditional  maximum  likelihood  (ML)  estimators  are  effective  when 
training  data  is  abundant  but  lead  to  poor  estimates,  degraded  false  alarm  rates,  and  detection  loss 
in  the  realistic  regime  of  limited  training.  The  problem  is  exacerbated  by  recent  advances  which 
have  led  to  high  dimensionality  N  of  the  observations  arising  from  increased  antenna  elements  (J) 
as  well  as  higher  temporal  resolution  (P  time  epochs  and  finally  N  =  JP).  This  work  introduces 
physically  inspired  constraints  into  ML  estimation.  In  particular,  we  exploit  both  the  structure  of  the 
disturbance  covariance  and  importantly  the  knowledge  of  the  clutter  rank  to  yield  a  new  rank  con¬ 
strained  maximum  likelihood  (RCML)  estimator  of  clutter/disturbance  covariance.  We  demonstrate 
that  the  rank-constrained  estimation  problem  can  in  fact  be  cast  in  the  framework  of  a  tractable 
convex  optimization  problem,  and  derive  closed  form  expressions  for  the  estimated  covariance  matrix. 
Performance  analysis  using  the  KASSPER  data  set  (where  ground  truth  covariance  is  made  avail¬ 
able)  shows  that  the  proposed  estimator  vastly  outperforms  state-of-the  art  alternatives  in  the  sense 
of  higher  normalized  signal  to  interference  and  noise  ratio  (SINR).  Crucially  the  proposed  RCML 
estimator  can  excel  even  for  low  training  including  the  notoriously  difficult  regime  of  K  <  N  training 
samples. 
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2-D 


Angle  Projection 


Figure  1.1:  The  target  and  interference  scenario  in  an  airborne  radar. 

1.2  Introduction 

1.2.1  Spatio-Temporal  Adaptive  Processing  in  Radar 

the  radar  receiver  front  end  consists  of  an  array  of  J  antenna  elements,  which  receives  signals  from 
targets,  clutter,  and  jammers.  These  reflections  induce  a  voltage  at  each  element  of  the  antenna  array, 
which  constitutes  the  measured  array  data  at  a  given  time  instant.  Snapshots  of  the  measured  data 
collected  at  P  successive  time  epochs  give  rise  to  the  spatio-temporal  nature  of  the  received  radar 
data.  The  spatio-temporal  product  JP  =  N  is  defined  to  be  the  system  dimensionality.  Fig.  1.1  uses 
the  angle-Doppler  space  to  illustrate  the  need  for  space-time  adaptive  processing  (STAP).  A  target  at 
a  specific  angle  and  traveling  at  a  specific  velocity  (corresponding  to  a  Doppler  frequency)  occupies  a 
single  point  in  this  space.  A  jammer  originates  from  a  particular  angle  but  is  temporally  (in  Doppler 
domain)  white  and  the  clutter  occupies  a  ridge  in  this  2-D  space.  Consequently  the  target  signal  is 
masked  by  white  jammer  in  Doppler  domain  and  the  clutter  in  spatial  domain  and  therefore,  carrying 
out  merely  temporal  (Doppler  domain)  or  spatial  (angle  domain)  processing  fails  to  separate  the  target 
from  the  interference  [1].  On  the  other  hand,  joint  domain  processing  in  angle  and  Doppler  enables 
target  detection  as  shown  in  Fig.  1.1. 

The  target  detection  problem  can  be  cast  in  the  framework  of  a  statistical  hypothesis  test  of  the 
form 


Hq-.x  =  d  =  c-|-j-l-n  (1.1) 

Hi:x  =  as{et,  ft) +  d  =  as{0t,  ft) +c+i  +  n  (1.2) 
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where  x  G  is  a  vector  form  of  the  received  data  under  either  hypothesis,  d  represents  the 

overall  disturbance  which  is  the  sum  of  c,  clutter,  j,  jammers,  and  n,  the  background  white  noise. 
The  vector  s  is  a  known  spatio-temporal  steering  vector  that  represents  the  signal  returned  from  the 
target  for  a  specific  angle  and  Doppler  and  a  is  the  unknown  target  complex  amplitude. 

The  whiten-and-match  filter  (MF)  for  detecting  a  rank-1  signal  is  the  optimum  processing  method 
for  Gaussian  interference  statistics.  It  is  given  by  [2] 


w  = 


^  ^MF 


Ho 


^MF 


(1.3) 


where  R^  is  a  known  interference  covariance  matrix.  Eq.  (1.3)  represents  the  matched  filtering  of  the 
whitened  data  x  =  R^  ^^^x  and  whitened  steering  vector  s  =  R^  From  Eq.  (1.3),  it  turns  out 
that  the  interference  covariance  matrix  R,;  is  crucial  in  the  detection  statistic. 


1.2.2  Motivation  and  Review 

Because  the  covariance  matrix  plays  a  crucial  role  in  the  detection  statistic  (see  Eq.  (1.3)),  it  is  very 
important  to  estimate  it  reliably.  Widrow  et  al.  and  Applebaum  proposed  least-squares  method  [3]  and 
maximum  signal-to- noise- ratio  criterion  [4],  respectively,  using  feedback  loops,  respectively.  However, 
these  methods  were  slow  to  converge  to  the  steady-state  solution.  Reed,  Mallet,  and  Brennan  [5] 
verihed  that  the  sample  matrix  inverse  (SMI)  method  demonstrated  considerably  better  convergence. 
In  the  sample  matrix  inverse  method,  the  disturbance  covariance  matrix  can  be  estimated  using  K 
data  ranges  for  training 

(1-4) 

fc=i 

where  K  is  the  number  of  training  data  we  received,  xj,  G  C'^,  N  =  JP  is  the  fcth  vector  of  training 
data,  and  X  =  [xi  X2  ...  xk]  G  It  is  well  known  that  the  sample  covariance  is  the  uncon¬ 

strained  maximum  likelihood  estimator  when  K  >  N.  Despite  this  virtue,  there  remain  fundamental 
problems  with  the  SMI  approach.  First,  typically  K  >  N  training  samples  are  needed  to  guarantee 
the  non-singularity  of  the  estimated  covariance  matrix.  In  fact,  when  K  <  N  the  estimate  is  singular 
and  cannot  be  inverted  which  is  highly  undesirable  in  STAP.  As  much  past  research  as  shown  [6] ,  the 
estimate  also  does  quite  poorly  in  the  vicinity  oi  K  =  N  training  samples. 

large  number  of  homogenous  training  samples  are  generally  not  available  [7] .  One  reason  is  that  it 
is  hard  to  guarantee  target  free  disturbance  observations.  There  are  also  severe  practical  limitations 
imposed  by  the  spatio-temporal  nonstationarity  of  the  interference  as  well  as  by  system  considerations 
such  as  bandwidth  and  fast  scanning  arrays.  It  is  well  known  that  K  =  2N  training  samples  are  needed 
to  keep  the  performance  within  3dB  of  the  optimal  processor.  For  example  with  J  =  11  and  P  =  32, 
the  parameters  for  the  Knowledge  Aided  Sensor  Signal  Processing  Expert  Reasoning  (KASSPER) 
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program  dataset  [8],  K  =  2JP  =  704  training  samples  are  needed.  Assuming  an  instantaneous  RF 
bandwidth  of  500  kHz,  this  requires  wide-sense  stationarity  over  a  400  km  range.  There  are  additional 
factors  which  make  the  training  data  scarce.  They  are  1.)  system  errors  like  aircraft  crabbing  and 
internal  clutter  motion  [1],  2.)  environmental  considerations  such  as  strong  clutter  discretes  [9]  and 
range  varying  interference  spectra  and  power  levels  [10],  and  3.)  outlier  contamination  of  training 
data  by  target-like  signals  [11]. 

To  overcome  the  practical  issue  of  lack  of  generous  training,  researchers  have  developed  approaches 
that  reduces  the  spatio-temporal  DOF,  which  results  in  reductions  in  the  number  of  required  training 
samples  and  computation  cost  as  well  [6].  These  works  are  shown  in  the  Joint  Domain  Localized 
(JDL)  processing  algorithm  [12],  the  Parametric  Adaptive  Matched  Filter  (PAMF)  [13]  and  references 
therein,  the  Multi-Stage  Wiener  Filter  (MSWF)  [14],  and  factored  STAP  methods  [1].  Another 
important  approach  is  the  Direct  Data  Domain  (D^)  approach  [15]  which  is  not  dependent  on  any 
statistical  training.  In  [16],  authors  extend  the  D^  algorithm  to  include  statistical  processing. 

Finally,  using  signal  processing  and  statistical  learning  techniques,  covariance  matrix  estimation 
techniques  that  enforce  and  exploit  particular  structure  have  been  pursued.  Examples  of  structure 
include  persymmetry  [17],  the  Toeplitz  property  [18,  19,  20],  circulant  structure  [21],  multichannel 
autoregressive  models  [13,  22]  and  physical  constraints  [23].  The  FML  method  [24]  which  enforces 
special  eigenstructure  also  falls  in  this  category  and  in  fact  is  shown  to  be  the  most  competitive 
technique  experimentally  [11,  6].  In  particular,  the  disturbance  covariance  matrix  R  represents  the 
exhibits  the  following  structure 

R  =  (T2l-FRe  (1.5) 

where  Rc  denotes  the  clutter  matrix  which  has  a  low  rank  and  is  positive  semi-definite  and  I  is 
an  identity  matrix.  Steiner  and  Gerlach’s  FML  technique  ensures  that  the  estimated  covariance 
matrix  has  eigenvalues  all  greater  than  cr^  by  assuming  that  its  value  is  known  (or  at  least  can  be 
approximately  known  a  priori)  -  which  is  sometimes  unrealistic.  Recently,  the  work  by  Aubry  et  al  [25] 
has  also  improved  upon  FML  by  the  introduction  of  a  condition  number  constraint.  Other  approaches 
include  Bayesian  covariance  matrix  estimators  [26,  27,  28,  29,  30]  and  the  use  of  knowledge-based 
covariance  models  [31,  32,  33,  34].  Finally,  shrinkage  estimation  methods  have  been  also  considered 
[35,  36,  37,  38]. 

1.3  Methods,  Assumptions,  and  Procedures 

1.3.1  Overview  of  Contribution 

The  principal  contribution  of  our  work  us  to  incorporate  the  rank  of  the  clutter  component  Rc 
explicitly  into  ML  estimation  of  the  disturbance  covariance  matrix.  Under  ideal  conditions  (no  mutual 
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coupling  between  array  elements  and  no  internal  clutter  motion),  Brennan  rule  [1]  states  that  the  rank 
of  Rc  in  the  airborne  linear  phased  array  radar  problem  is  given  by 

rank(Rc)  =  J  +  7(P  —  1)  (1.6) 

where  7  =  2vpT/d  is  the  slope  of  the  clutter  ridge,  with  Vp  denoting  the  platform  velocity,  T  denoting 
the  pulse  repetition  interval,  and  d  denoting  the  inter-element  spacing.  Even  if  there  is  mutual  coupling 
in  practice,  Rc  has  rank  r  which  is  much  less  than  the  spatio-temporal  product  N  =  JP  in  many 
practical  airborne  radar  applications.  In  addition,  powerful  techniques  have  been  developed  [11]  to 
determine  the  rank  fairly  accurately. 

We  first  set  up  the  optimization  problem  to  estimate  the  disturbance  covariance  matrix  with  a 
structural  constraint  on  R  and  the  rank  constraint  on  Rc.  The  estimation  problem  when  seen  as  an 
optimization  over  R  is  unfortunately  not  a  convex  problem,  since  neither  the  cost  function  nor  the 
constraints  (rank)  are  convex  (elaborated  upon  in  Section  1.3.2).  We  will  however  show  that  using 
a  transformation  of  variables,  reduction  to  a  convex  form  is  possible  and  further  by  invoking  KKT 
conditions  [39]  for  the  resulting  convex  problems,  it  is  in  fact  possible  to  derive  a  closed  form  solution. 
Akin  to  FML,  we  assume  that  the  noise  power  is  known  while  setting  up  and  solving  the  problem. 

1.3.2  ML  Estimation 

Let  Zi  G  be  the  tth  realization  of  the  target-free  (stochastic)  disturbance  vector  and  K  be  the 
number  of  training  samples.  That  is,  i  =  1,2, . . . ,  K  and  N  =  JP.  Therefore,  under  each  training 
sample,  z^,  under  assumption  of  zero  mean,  obeys 

/(zi)  =  exp(-zf  R-^z,)  (1.7) 

which  comes  from  a  zero-mean  complex  circular  Gaussian  distribution  and  R  is  the  N  x  N  distur¬ 
bance  covariance  matrix.  Further,  |R|  denotes  the  determinant  of  R,  zf^  is  the  Hermitian  (conjugate 
transpose)  of  z^.  Let  Z  be  the  M  x  K  complex  matrix  whose  z-th  column  is  the  observed  vector  z^. 
Since  each  observations  z^  are  i.i.d,  the  likelihood  of  observing  Z  given  R  is  given  by 

/(K)(Z)  =  ^|R|-^exp(-tr{Z«R-iZ})  (1.8) 

=  ^|R|-^exp(-tr{R-iZZ^})  (1.9) 

=  ^|R|-^exp(-iL-tr{R-iS})  (1.10) 
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where  S  =  is  the  well-known  sample  covariance  matrix.  Our  goal  is  to  find  the  positive  definite 

matrix  R  that  maximizes  the  likelihood  function  /(r)(Z).  The  logarithm  of  the  likelihood  term  is 

log/(R)(Z)  =  -K  ■  tr{R^iS}  -  if  log(|R|)  -  iVif  log(7r).  (1.11) 

Maximizing  the  log-likelihood  as  a  function  of  R  is  equivalent  to  minimizing  the  function  given  by 

tr{R-iS}  +  log(|R|).  (1.12) 

Therefore,  Eq.  (1.12)  is  the  cost  function  of  our  optimization  problem.  Since  the  cost  function  is  not  a 
convex  function  in  R,  we  apply  a  transformation  variables  i.e.,  let  X  =  cr^R'^  and  S'  =  ^S.  Then, 
the  revised  cost  function  in  the  optimization  variable  X  becomes 

tr{R-iS}  +  log(|R|) 

=  tr{S'X}-log(|4x|) 

=  HS'XI-logdXD+loga^"^ 

Since  logu^^  in  Eq.  (1.14)  is  a  constant,  the  final  cost  function  to  be  minimized  is 

tr{S'X}-log(|X|).  (1.15) 

N  N 

Note  that  trjS'X}  =  EE  s'jiXij  is  affine  and  log(|X|)  is  concave,  which  implies  —  log(|X|)  is  convex. 

j=l 

Therefore,  the  final  cost  function,  Eq.  (1.15)  is  convex  in  the  variable  X. 

We  now  express  X  and  S'  in  terms  of  their  eigenvalue  decomposition,  i.e.,  X  =  and  using 

the  eigendecompositions,  the  cost  function  can  be  simplified  as 

d^A-l^logA  (1.16) 


(1.13) 

(1.14) 


where  d  and  A  are  vectors  with  entries  of  eigenvalues  of  S'  and  X  respectively.  This  result  is  in  fact 
fairly  well  known  from  standard  unconstrained  ML  estimation  of  non-singular  R. 

We  assume  the  noise  power  is  known.  Then  the  constraints  of  the  optimization  problem  are 


R  =  cr^I  + 
rankCRc)  =  r 
R-c  ^  0 
R^  cr^I 


(1.17) 


Since  rank(Rc)  =  r,  Rc  has  r  non-negative  eigenvalues  and  the  rest  eigenvalues  are  all  zero.  Hence, 


DISTRIBUTION  A:  Approved  for  public  release 


10 


from  Eq.  (1.5),  R  has  r  eigenvalues  which  are  greater  than  or  equal  to  and  the  rest  eigenvalues 
equal  to  cr^.  Hence,  the  eigenvalues  of  X  should  be  satisfy 

Ai  <  A2  <  •  •  •  <  <  A^+i  =  •  •  •  =  Aat  =  1  (1.18) 

Now  the  final  optimization  problem  can  be  expressed  in  vector-matrix  form, 

f  nun  A  —  log  A 

<  s.t.  FA^g  (1-19) 

EA  =  h 


U 

0 

-  _ 

where  F  = 

-I 

,  g  = 

— e 

,E  = 

Orxr  ^rx{N—r) 

^(N—r)xr  ^N—r 

,  and 

I 

1 

h  =  [0,0,---  ,0,.,1,1,---  ,1]^.  Here,  A,  d,  h  G  R^,  g  G  R^^,  U,  E  G  R^^^,  and  F  G  The 

optimization  problem  (1.19)  is  obviously  a  convex  optimization  problem  because  the  cost  function  is 
a  convex  function  and  feasible  constraint  sets  are  convex  as  well. 

A  closed  form  solution  for  (1.19)  can  in  fact  be  derived  using  KKT  conditions  [39]  in  constrained 
optimization.  The  optimal  solution  A*  is 

{min(l,  — )  for  t  =  1,  2, . . . ,  r 

d^  (1.20) 

1  for  i  =  r -|- 1,  r -I- 2, . . . , 

1.4  Results  and  Discussions 

1.4.1  Experimental  Setup  and  Methods  Compared 

Data  from  the  L-band  data  set  of  the  Knowledge  Aided  Sensor  Signal  Processing  and  Expert  Reasoning 
(KASSPER)  program  [8]  is  used  for  the  performance  analysis  discussed  in  this  section.  The  KASSPER 
data  is  the  result  of  a  significant  effort  by  DARPA  to  provide  a  publicly  available  resource  for  the 
evaluation  and  benchmarking  of  radar  STAP  algorithms.  As  elaborated  in  [29],  the  KASSPER  data 
set  was  carefully  captured  to  represent  real-world  ground  clutter  and  captures  variations  in  underlying 
terrain,  foliage  and  urban/manmade  structures.  Further,  the  KASSPER  data  set  exhibits  two  very 
desirable  characteristics  from  the  viewpoint  of  evaluating  covariance  estimation  techniques:  1.)  the 
low-rank  structure  of  clutter  in  KASSPER  has  been  verified  by  researchers  before  [11,  29],  and  2.)  the 
true  covariance  matrices  for  each  range  bin  have  been  made  available  -  this  facilitates  comparisons  via 
powerful  figures  of  merit  where  the  theoretical  upper/lower  bounds  are  known. 

The  L-band  data  set  consists  of  a  data  cube  of  1000  range  bins  corresponding  to  the  returns 
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Table  1.1:  KASSPER  Dataset- 1  parameters 


Parameter 


Value 


Carrier  Frequency 
Bandwidth  (BW) 

Number  of  Antenna  Elements 

Number  of  Pulses 

Pulse  Repetition  Frequency 

1000  Range  Bins 

91  Azimuth  Angles 

128  Doppler  Frequencies 

Clutter  Power 

Number  of  Targets 

Range  of  Target  Dop.  Freq. 


1240  MHz 

10  MHz 

11 
32 

1984  Hz 

35  km  to  50  km 
87°,  89°,  ...  267° 

-992  Hz,  -976.38  Hz,  . . .,  992  Hz 
40  dB 

226  (  200  detectable  targets) 
-99.2  Hz  to  372  Hz 


from  a  single  coherent  processing  interval  from  11(=  J)  channels  and  32(=  P)  pulses.  Therefore,  the 
dimension  of  observations  (or  the  spatio-temporal  product)  V  is  11  x  32  =  352.  Other  key  parameters 
are  detailed  in  Table  2.1.  Finally,  a  clutter  rank^  ofr  =  J-|-T’— 1  =42  was  used  by  our  RCML 
estimator  in  all  the  results  to  follow,  unless  explicitly  stated  otherwise. 

We  evaluate  and  compare  four  different  covariance  estimation  techniques: 

•  Sample  Covariance  Matrix:  The  sample  covariance  matrix  is  given  in  Eq.  (1.4).  It  is  well 
known  that  the  sample  covariance  is  the  unconstrained  maximum  likelihood  estimator  under 
Gaussian  disturbance  statistics.  Consistent  with  radar  literature  [5],  we’ll  refer  to  the  use  of  this 
technique  as  SMI. 

•  Fast  Maximum  Likelihood:  The  fast  maximum  likelihood  (FML)  [24]  uses  the  structural 
constraint  of  the  covariance  matrix  which  is  given  in  Eq.  (1.5).  The  FML  method  just  involves 
calculating  the  eigenvalue  decomposition  of  the  sample  covariance  and  perturbing  eigenvalues  to 
conform  to  the  structure  in  Eq.  (1.5).  The  noise  variance  cr^  is  assumed  known  or  pre- estimated. 
FML’s  success  in  radar  STAP  is  widely  known  [11,  40,  6]. 

•  Leave-one-out  shrinkage  estimator:  Shrinkage  estimators  are  powerful  estimators  of  covari¬ 
ance  for  high  dimensional  data  that  are  known  to  also  perturb  the  eigenstructure  of  the  sample 
covariance  matrix^  [35]  -  often  to  ensure  non-singularity  of  the  estimated  covariance.  While  a 
variety  of  shrinkage  techniques  are  known  [35,  36,  37,  38],  we  choose  the  leave-one-out  covariance 
matrix  estimate  (LOOC)  shrinkage  estimator  [41], 

R  =  Pdiag{S)  +  {1  -  P)S  (1.21) 

The  value  of  /?  is  determined  via  a  cross-validation  technique  so  that  the  average  likelihood  of 

^We  set  clutter  ridge  parameters  so  that  7  =  1. 

^Via  this  definition,  the  FML  and  RCML  can  also  been  seen  as  a  special  class  of  shrinkage  estimators. 
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omitted  samples  is  maximized.  We  pick  this  estimator  because  it  is  most  suited  to  the  problem 
at  hand  and  has  demonstrated  success  in  the  K  <  N  training  regime  [41]. 

•  Eigencanceler:  The  eigencanceler  (EigC)  is  based  on  the  eigenanalysis  which  suggests  a  small 
number  of  eigenvalues  contain  all  the  information  about  interferences  (jammers  and  clutter),  and 
therefore,  the  span  of  the  eigenvectors  associated  with  these  significant  eigenvalues  includes  all 
the  position  vectors  that  comprise  the  interference  signals  [42].  Since  we  assume  that  the  rank 
is  known  a  priori,  the  eigencanceler  can  be  compared  with  our  estimator  as  we  use  r  dominant 
eigenvectors  as  interference  eigenvectors.  The  covariance  matrix  can  be  expressed  by 

r 

R  =  +  cr^I  (1.22) 

i=l 

where  pi  and  are  the  clutter  power  and  the  eigenvector  corresponding  to  r  dominant  eigen¬ 
values,  respectively.  For  pi  ^  it  follows  from  [43,  11]  that  the  estimated  inverse  covariance 

1  ’’ 

matrix  can  be  approximated  as  «  ^(I  —  P)  where  P  =  vpvf .  We  apply  this  inverse 

i—1 

covariance  matrix  in  computing  the  SINK  as  well  as  estimator  variance. 

•  Rank  Constrained  Maximum  Likelihood:  Our  proposed  estimator  (abbreviated  to  RCML) 
incorporates  the  structural  constraint  and  for  the  first  time  the  information  of  the  rank  of  the 
clutter  component. 


1.4.2  Experimental  Evaluation 

The  normalized  signal  to  interference  and  noise  ratio  (SINK)  is  used  for  evaluation  the  aforementioned 
covariance  estimation  techniques.  The  SINK  is  desired  to  be  as  high  as  possible.  This  figure  of  merit 
is  plotted  against  azimuthal  angle  as  well  as  Doppler  frequency  for  distinct  training  regimes,  i.e. 
low,  representative  and  generous  training.  We  also  show  the  plot  of  SINK  performance  versus  the 
number  of  training  samples.  Finally,  we  also  evaluate  the  robustness  of  our  RCML  estimator  against 
perturbations  in  the  knowledge  of  the  true  rank. 


Normalized  SINR  vs.  angle  and  Doppler 

The  normalized  SINR  measure  [44]  is  commonly  used  in  the  radar  literature  and  is  given  by 


js^R 

[s^R-iRR-islls^R-is] 


(1.23) 


where  s  is  the  spatio-temporal  steering  vector,  R  is  an  estimated  covariance  matrix,  and  R  is  the 
corresponding  true  covariance  matrix.  It  is  easily  seen  that  0  <  ?7  <  1  and  r;  =  1  if  and  only  if 
R  =  R.  Since  the  steering  vector  is  a  function  of  both  azimuthal  angle  and  Doppler  frequency,  we 
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evaluate  the  normalized  SINK  in  both  angle  and  Doppler  domain.  This  would  lead  to  a  SINK  surface 
as  a  function  of  azimuthal  angle  and  Doppler  and  comparing  surface  plots  across  different  covariance 
estimation  techniques  is  cumbersome.  We  therefore  obtain  plots  as  a  function  of  one  variable  (i.e.  just 
angle/Doppler)  by  marginalizing  (averaging)  over  the  other  variable.  The  SINK  is  plotted  in  dB.  in 
all  figures  in  this  chapter,  that  is,  SINRdB  =  10  log^Q  77.  Therefore,  SINRdB  <  0. 

Fig.  1.2  plots  the  variation  of  normalized  SINR  as  a  function  of  the  azimuthal  angle  and  the 
Doppler  frequency  for  varying  number  of  training  samples,  K.  Specifically,  Figs.  1.2  (a)  and  (b)  are 
corresponding  to  K  =  300  <  N  =  352,  Figs.  1.2  (c)  and  (d)  plots  results  for  K  =  352  =  N,  likewise 
Figs.  1.2  (e),(f)  and  (g),(h)  are  corresponding  to  K  =  750  «  2N  and  K  =  3000  »  N  respectively. 

Figs.  1.2  (a),(b)  and  (c),  (d)  report  results  for  the  challenging  regime  oi  K  <  N.  When  K  <  N 
the  sample  covariance  matrix  is  not  invertible,  hence  for  the  results  in  Figs.  1.2  (a),(b)  we  used  its 
pseudo-inverse  as  a  substitute.  Unsurprisingly,  the  sample  covariance  technique  suffers  tremendously 
when  RT  <  TV  as  is  evident  from  Figs.  1.2  (a)-(d).  LOOC  shrinkage  does  considerably  better  than 
SMI  because  it  forces  a  reasonably  good  eigenstructure.  The  informed  estimators,  i.e.  FML,  EigC, 
and  RCML  perform  appreciably  well  with  RCML  affording  the  best  overall  performance.  It  is  useful 
to  note  that  RCML  in  fact  offers  about  1  dB  improvement  over  FML. 

Even  for  representative  training  in  Figs.  1.2  (e)-(f),  the  vastly  superior  performance  of  the  FML, 
EigC,  and  RCML  techniques  is  apparent.  Again,  by  virtue  of  incorporating  the  rank  information,  the 
proposed  RCML  estimator  outperforms  the  competing  methods.  Einally,  Figs.  1.2  (g)-(h)  confirm  the 
intuition  that  as  training  becomes  close  to  asymptotic,  the  gap  between  the  various  methods  begins 
to  decrease  -  of  course,  such  generous  training  is  typically  impossible  to  obtain  in  practice.  This  is 
due  to  the  fact  that  all  the  covariance  matrix  estimates  considered  converge  to  the  true  covariance 
matrix  in  the  limit  of  large  training  data. 

Performance  vs.  number  of  training  samples 

While  the  results  in  Sections  1.4.2  do  explore  performance  against  training  to  some  extent  -  here  we 
present  bar  graphs  to  explore  this  issue  with  a  finer  granularity.  To  obtain  a  single  scalar  performance 
measure  as  a  function  of  training,  averaging  was  carried  out  over  both  the  angle  and  Doppler  variables. 

Fig.  1.3  (a)  and  (b)  present  bar  graphs  that  quantify  the  SINR  and  estimator  variance  (both  in 
dB)  as  a  function  of  training  samples  AT,  where  K  is  varied  from  as  low  as  60  to  as  high  as  3000. 
Two  trends  are  evident  from  Fig.  1.3  (a):  1.)  as  intuitively  expected,  the  SINR  values  increases 
monotonically  with  an  increase  in  the  number  of  training  samples  for  all  methods  (except  for  the 
sample  covariance  technique  in  the  K  <  N  regime  which  is  a  well-known  phenomena  observed  in  past 
work  as  well  [24])  and  2.)  the  RCML  estimator  exhibits  remarkably  good  performance  in  all  training 
regimes.  Similar  trends  are  observed  for  the  estimator  variance  as  well  in  Fig.  1.3  (b)  except  that 
we  see  a  monotonic  decrease  instead.  Again,  the  RCML  estimator  is  consistently  the  best  except  for 
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Normalized  SINR  in  dB  vs.  angle:  300  training  samples 


Normalized  azimuthai  angle 

(a) 


Normalized  SiNR  in  dB  vs.  Doppler:  300  training  samples 


(b) 


Normalized  SiNR  in  dB  vs.  angle:  352  training  samples 


Normalized  SINR  in  dB  vs.  Doppier:  352  training  samples 


Normalized  doppier  freq 


-2S- 


(c) 


(d) 


Figure  1.2:  Normalized  SINR  vs.  normalized  azimuthal  angle  and  doppier  frequency,  respectively 
(true  ranges  of  azimuth  and  doppier  can  be  seen  in  Table  2.1).  Sample  covariance  matrix  (SMI), 
fast  maximum  likelihood  (FML),  LOOC  shrinkage  estimator  (LOOC),  eigencanceler  (EigC),  and  rank 
constrained  maximum  likelihood  (RCML)  estimators  are  the  methods  compared.  K  =  300  is  used  for 
(a)  and  (h),  K  =  352  is  used  for  (c)  and  (d),  K  =  750  is  used  for  (e)  and  (f),  and  finally  K  =  3000  is 
used  for  (g)  and  (h). 


K  =  60  samples  where  all  estimators  other  than  sample  covariance  are  very  close  and  no  clear  winner 
emerges. 


Rank  Sensitivity 

The  KASSPER  data,  the  clutter  rank  conforms  to  Eq.  (1.6)  -  the  Brennan  rule.  Eor  the  parameters 
used  in  our  experiments,  this  would  lead  to  a  predicted  ideal  rank  of  r  =  J  +  P— 1  =  42.  In  a  practical 
situation,  departures  from  the  ideal  behavior  are  expected  and  hence  we  explore  the  performance  our 
proposed  RCML  estimator  even  as  incorrect  rank  information  is  used. 

The  results  in  Fig.  1.4  demonstrate  the  robustness  of  RCML  to  perturbations  in  the  clutter  rank. 
Fig.  1.4  presents  bar  graphs  that  show  averaged  SINR  results  for  K  =  352  and  K  =  750  training 
samples.  We  determined  numerically  that  the  “true”  rank  of  the  clutter  covariance  for  the  range 
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Normalized  SINR  in  dB  vs.  angle;  750  training  samples 


(e) 


Normalized  SiNR  in  dB  vs.  Doppler;  750  training  samples 


(f) 


Normalized  SINR  in  dB  vs.  angle;  3000  training  samples 


(g) 


Normalized  SINR  in  dB  vs.  Doppler;  3000  training  samples 


(h) 


Figure  1.2:  Normalized  SINR  vs.  normalized  azimuthal  angle  and  doppler  frequency,  respectively 
(true  ranges  of  azimuth  and  doppler  can  be  seen  in  Table  2.1).  Sample  covariance  matrix  (SMI), 
fast  maximum  likelihood  (FML),  LOOC  shrinkage  estimator  (LOOC),  eigencanceler  (EigC),  and  rank 
constrained  maximum  likelihood  (RCML)  estimators  are  the  methods  compared.  K  =  300  is  used  for 
(a)  and  (h),  K  —  352  is  used  for  (c)  and  (d),  K  =  750  is  used  for  (e)  and  (f),  and  finally  K  =  3000  is 
used  for  (g)  and  (h). 


bin  of  choice  was  in  fact  43  which  is  a  mild  departure  from  the  42  predicted  by  the  Brennan  rule. 
Comparisons  are  made  between  FML  and  RCML  with  the  difference  that  seven  variants  of  RCML 
are  presented  -  with  rank  from  34  to  45.  As  Fig.  1.4  reveals,  using  the  true  rank  of  43  indeed  yields 
the  best  covariance  matrix  estimator  but  the  penalty  of  the  small  departure,  i.e.  using  a  rank  of  40 
to  45  which  are  close  to  the  true  rank  43  leads  to  a  very  small  performance  loss.  On  the  other  hand. 
Figs.  1.4  also  shows  variants  of  the  RCML  result  with  a  somewhat  bigger  departure,  i.e.  a  rank  of  34. 
In  this  case,  the  performance  of  RCML  with  rank  34  is  appreciably  lower  against  using  rank  values 
around  the  true  rank  43.  Remarkably,  RCML  with  rank  34  is  still  competitive  with  FML.  Overall  Fig. 
1.4  therefore  provides  two  valuable  insights:  I.)  since  rank  information  is  predicted  using  the  Brennan 
rule  -  small  departures  in  practice  are  possible  and  our  estimator  exhibits  desirable  robustness  against 
such  small  perturbations  to  rank,  and  2.)  the  value  of  using  the  rank  information  is  simultaneously 
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Normalized  SINR  in  dB  vs.  the  number  of  traning  samples 


Estimation  error  variance  vs.  the  number  of  traning  samples 


The  number  of  training  samples 


60  100  200  250  300  352  750  3000 

The  number  of  training  samples 


(a) 


(b) 


Figure  1.3:  Normalized  SINR  and  Estimator  variance  vs.  the  number  of  training  samples.  The  used 
numbers  are  60,  100,  200,  250,  300,  352,  750,  and  3000  (a)  Normalized  SINR  and  (b)  Estimator  error 
variance. 


Figure  1.4:  Normalized  SINR  of  rank  constrained  maximum  likelihood  (RCML)  for  various  rank 
information 


revealed  -  because  RCML  with  rank  34  is  competitive  with  FML,  it  shows  that  FML  significantly 
underestimates  the  true  rank  in  these  examples. 


Practical  Merits  of  the  RCML  Estimator 

The  experiments  in  Section  1.4.2  to  Section  1.4.2,  however,  assume  that  we  have  access  to  homogeneous 
training  samples,  which  is  often  not  available  in  practice.  This  section  provides  more  realistic  and 
challenging  practical  evaluation  by  means  of  two  new  flavors  of  experimental  results:  1.)  plots  of 
probability  of  detection  versus  SNR  for  a  variety  of  detection  statistics,  and  2.)  normalized  SINR 
performance  in  the  presence  of  heterogeneous  training  samples  which  are  corrupted  by  the  target 
information.  Here,  we  consider  an  experimental  environment  which  reflects  real-world  scenarios  by 
considering  non-homogenous  training.  We  perform  two  experimental  investigations.  First,  we  examine 
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if  incorporating  rank-information  really  leads  to  better  target  detection.  Second,  robustness  to  target 
contamination  in  training  samples  is  investigated  -  while  outlier  removal  techniques  have  been  proposed 
[10,  45],  in  practice  target  contamination  of  training  data  cannot  be  entirely  ruled  out.  We  evaluate 
and  compare  three  different  covariance  estimation  techniques,  SMI,  FML  and  our  proposed  RCML.  We 
show  that  the  RCML  estimator  can  still  outperform  alternatives  in  that  detection  probability  is  second 
only  to  the  theoretic  upper  bound  when  the  true  covariance  is  known,  and  the  rank  information  is 
invaluable  in  yielding  meaningful  estimates  even  as  almost  all  available  training  samples  are  corrupted. 


Probability  of  Detection  vs.  SNR  We  apply  three  test  statistics,  the  normalized  matched  filter 
(NMF),  the  adaptive  matched  hlter  (AMF)  [2],  and  the  generalized  likelihood  ratio  test  (GLRT)  [46]. 
The  test  statistics  are  given  by 


NMF: 

,  AMF: 
GLRT: 


|s^R-ie|2 


Hi 

^  Anmf 

(s^R-is)(e^R-ie)  Ho 
|s^R-i  ^ 


s^R“is 


^  Aamf 

Ho 


s^R'^s^l  -I-  ^e^R'^e 


Hi 

^  ATAglrt 
Ho 


(1.24) 


where  s,  R,  e,  and  K  are  the  steering  vector,  the  estimated  covariance  matrix,  the  observation  vector, 
and  the  number  of  training  samples,  respectively.  The  detection  probability  Pd  is  defined  as  the 
probability  that  the  value  of  test  statistic  is  greater  than  a  threshold  conditioned  on  the  hypothesis 
that  the  received  data  includes  target  information.  Therefore,  it  depends  on  signal  to  noise  ratio 
(SNR,  by  virtue  of  s,)  and  the  estimated  covariance  matrix.  Since  Pd  does  not  typically  admit  a 
closed  form,  we  first  generate  a  number  of  samples  from  the  L-band  data  set  of  KASSPER  program 
to  determine  A  corresponding  to  the  fixed  false  alarm  rate  and  then  employ  Monte  Carlo  simulations 
to  evaluate  Pd  corresponding  to  each  estimator  for  each  of  the  test  statistics.  We  set  a  constant  false 
alarm  rate  to  10“^. 

Fig.  1.5  shows  the  detection  probability  Pd  plotted  as  a  function  of  SNR  for  different  estimators  and 
detection  statistics.  Figs.  1.5  (a)  and  (b)  plot  Pd  for  AMF  test.  Figs.  1.5  (c)  and  (d)  are  corresponding 
to  the  NMF  test,  and  Figs.  1.5  (e)  and  (f)  plots  results  for  the  GLRT.  We  use  K  =  N  =  352  and 
K  =  2N  =  704  training  samples  to  estimate  the  covariance  matrix  for  each  of  the  test  statistics. 
Figs.  1.5  (a),  (c),  and  (e)  are  for  K  =  352  and  Figs.  1.5  (b),  (d),  and  (f)  are  for  K  =  704.  It  is 
well-known  that  K  =  2N  training  samples  are  needed  to  keep  the  performance  within  3dB.  Indeed, 
we  can  see  that  the  sample  covariance  matrix  has  about  3dB  loss  vs.  the  true  covariance  matrix  in  all 
of  test  statistics.  The  proposed  RCML  estimator  is  the  closest  to  the  Pd  achieved  by  using  the  true 
covariance  matrix  (upper  bound)  and  FML  follows  RCML.  As  expected,  each  estimator  shows  higher 
detection  probability  when  K  =  2N  vs.  K  =  N,  i.e.  an  increase  in  training.  Note  finally  that  the 
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(e)  (f) 


Figure  1.5:  Probability  of  detection  vs.  SNR.  Sample  covariance  matrix  (SMI),  fast  maximum  likeli¬ 
hood  (FML),  rank  constrained  maximum  likelihood  (RCML)  estimators  are  the  methods  compared. 
K  =  352  is  used  for  (a),  (c),  and  (e)  and  K  =  704  is  used  for  (b),  (d),  and  (f).  (a)  and  (b)  are  AMF 
test  performance,  (c)  and  (d)  are  NMF  test  performance,  and  (e)  and  (f)  are  GLRT  test  performance. 


RCML  estimator  performs  the  best  no  matter  which  test  statistic  is  applied  and  in  every  regime  of 
training. 

Robustness  to  Nonhomogeneous  Training  Samples  We  investigate  two  different  scenarios  to 
evaluate  robustness  to  nonhomogeneous  training  samples.  First,  we  fix  the  ratio  of  the  number  of 
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Normalized  SINR  in  dB.  K=352,  Pj=0.2  Normalized  SINR  in  dB.  K=352,  alpha=50 
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Normalized  SINR  in  dB.  K=704,  P,=0.2 


(c) 


Normalized  SINR  in  dB.  K=704,  alpha=50 
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Figure  1.6:  Normalized  SINR  in  dB.  vs.  target  intensity  a  and  percentage  corruption  P*.  Sample 
covariance  matrix  (SMI),  fast  maximum  likelihood  (FML),  rank  constrained  maximum  likelihood 
(RCML)  estimators  are  the  methods  compared.  K  =  352  is  used  for  (a)  and  (b),  and  K  =  704  is  used 
for  (c)  and  (d). 


corrupted  samples  including  target  information  to  target-free  samples.  This  ratio  is  given  by 


Pf.  = 


the  number  of  corrupted  samples  by  target  information 
K{=  the  number  of  total  training  samples) 


(1.25) 


and  the  intensity  of  target  signal  by  a,  that  is,  the  received  data  z  can  be  expressed  by 


z  =  as{9t,ft)  +  d 


(1.26) 


where  d  =  c  -|- j  -|-  n  represents  the  overall  disturbance  which  is  the  sum  of  c,  clutter,  j,  jammers,  n, 
the  background  white  noise,  and  comes  from  a  zero-mean  complex  circular  Gaussian  distribution,  s 
is  a  known  spatio-temporal  steering  vector  [40]  which  is  drawn  from  a  distribution  independent  of  d. 
In  particular,  we  examine  performance  as  the  percentage  of  corrupted  samples,  i.e.,  Pt  is  varied  while 
keeping  a  fixed  intensity  of  the  target  signal,  a.  Our  second  investigation  involves  varying  a  for  a 
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fixed  Pt- 

We  use  two  evaluation  measures:  the  normalized  SINK  and  a  trace  deviation  measure  -  TRD(R). 
1.)  Normalized  SINK:  The  normalized  SINK  measure  [44]  is  commonly  used  in  the  radar  literature 
and  is  given  by 


js^R 

|s^R-iRR-is||s«R-is| 


(1.27) 


where  s  is  the  spatio-temporal  steering  vector,  R  is  an  estimated  covariance  matrix,  and  R  is  the  true 
covariance  matrix.  It  is  easily  seen  that  0  <  r;  <  1  and  77  =  1  if  and  only  if  R  =  R.  The  SINK  is 
plotted  in  dB.  in  all  figures  in  this  chapter,  that  is,  SINRdB  =  lOlog^g??.  Therefore,  SINRdB  <  0. 

Fig.  1.6  presents  bar  graphs  that  show  averaged  SINR^b  results  for  K  =  352  and  K  =  704 
training  samples.  Because  the  steering  vector  is  a  function  of  both  azimuthal  angle  and  Doppler 
frequency,  we  evaluate  the  normalized  SINR  in  both  angle  and  Doppler  domain  and  average  over 
both  domains  to  get  the  normalized  SINR  value  represented  by  each  bar.  Figs.  1.6  (a)  and  (b)  are 
corresponding  to  K  =  N  =  352  and  Figs.  1.6  (c)  and  (d)  plots  results  for  K  =  2N  =  704.  In 
particular.  Figs.  1.6  (a)  and  (c)  plot  the  variation  of  the  normalized  SINR  for  varying  intensity  of 
the  steering  vector  a,  where  a  is  varied  from  as  low  as  0  to  as  high  as  50.  We  fixed  Pt  =  0.2  in 
these  plots.  Two  trends  are  evident  from  Figs.  1.6  (a)  and  (c):  1.)  as  intuitively  expected,  the  SINR 
values  decreases  monotonically  with  an  increase  in  a  for  all  methods  (except  for  the  sample  covariance 
technique  in  the  K  =  N  regime)  and  2.)  the  RCML  estimator  exhibits  appreciably  good  performance 
in  all  training  regimes.  Figs.  1.6  (b)  and  (d)  plot  the  SINR  performance  for  varying  Pt  where  a 
remains  a  constant,  a  =  50.  The  range  of  Pt  is  from  0  (no  target  corruption)  to  1  (all  the  samples  are 
corrupted  by  target  information).  Similar  trends  are  observed  as  well  in  Figs.  1.6  (b)  and  (d).  Again, 
the  RCML  estimator  consistently  outperforms  the  other  methods.  An  interesting  observation  is  that 
SINRDB  drops  more  rapidly  as  a  function  of  increasing  Pt  vs.  increasing  a,  which  reveals  that  Pt  is 
a  more  critical  factor  than  a  in  influencing  estimation  with  heterogeneous  training. 

2.)  TRD(R):  We  define  a  trace  deviation  measure  -  TRD(R)  =  |tr{RR  ^}/N  —  1|  that  is  an 
alternate  way  of  evaluating  the  performance  of  covariance  matrix  estimators.  Intuitively,  we  can  see 
tr{RR~^}/iV  =  I  when  R  =  R.  Therefore,  we  can  say  the  goal  of  estimation  is  to  tray  and  keep 
TRD(R)  as  small  as  possible,  ideally  close  to  0.  Figs.  1.7  shows  plots  bar  graphs  in  the  same  training 
regime  as  Figs.  1.6.  We  plots  values  of  TRD(R)  for  varying  a  and  Pt  and  the  number  of  training 
samples  are  K  =  352  and  K  =  704. 

We  can  observe  trends  similar  to  those  in  Fig.  1.6.  The  TRD  values  monotonically  increase  as 
a  and  Pt  increase  for  all  methods.  The  proposed  RCML  estimator  consistently  outperforms  other 
techniques  considered  in  all  experiments.  Additionally,  the  merits  of  RCML  in  robust  estimation  are 
brought  out.  The  TRD  values  corresponding  to  both  the  sample  covariance  matrix  and  the  FML 
estimator  increase  quite  dramatically  with  an  increase  in  a  and,  especially,  Pt-  However,  in  the  case 
of  the  RCML  estimator  this  increase  is  more  gradual.  The  TRD  values  corresponding  to  RCML  in 
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TRD(R^gj),  K=352,  alpha=50 


TRD(Rha,),  K=352,  P(=0.2 


(a) 


(b) 


Figure  1.7:  TRD(R)  vs.  target  intensity  a  and  percentage  corruption  Pt.  Sample  covariance  matrix 
(SMI),  fast  maximum  likelihood  (FML),  rank  constrained  maximum  likelihood  (RCML)  estimators 
are  the  methods  compared.  K  =  352  is  used  for  (a)  and  (b),  and  K  =  704  is  used  for  (c)  and  (d). 


Fig.  1.7  (d)  are  in  fact  still  close  to  0  even  under  severe  target  corruption,  i.e.  Pt  =  1. 


1.5  Conclusions 

We  developed  a  new  estimator  of  structured  covariance  matrices  (identity  plus  a  positive  semi-definite 
component)  which  employs  rank  of  the  positive  semi-definite  matrix  as  an  explicit  constraint  in  ML 
estimation.  In  radar  applications,  the  rank-deficient  component  corresponds  to  the  clutter  and  its 
rank  can  be  determined  using  the  Brennan  rule  for  airborne  radar  interacting  with  land  clutter.  We 
demonstrated  that  despite  the  presence  of  the  challenging  rank-constraint,  the  estimation  problem 
can  in  fact  be  reduced  to  a  convex  optimization  problem  and  admits  a  closed  form  solution.  Exper¬ 
imentally,  rank  information  plays  a  vital  role  and  rigorous  evaluation  over  the  KASSPER  data  set 
establishes  merits  of  the  proposed  estimator  when  evaluated  via  powerful  figures  of  merit  such  as  nor- 
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malized  SINK  and  estimator  variance.  Further,  significant  practical  merits  are  revealed  in  challenging 
real-world  radar  detection  and  estimation  set-ups.  Future  work  could  consider  the  incorporation  of 
more  constraints  on  the  clutter/disturbance  matrix  such  as  Toeplitz  structure  as  well  as  the  use  of 
physically  inspired  probabilistic  priors  in  a  Bayesian  setting. 
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Chapter  2 


Efficient  Approximation  of 
Structnred  Covariance  under  Joint 
Toeplitz  and  Rank  Constraints 

2.1  Summary 

Disturbance  covariance  estimation  is  a  centrally  important  problem  in  radar  space  time  adaptive 
processing  (STAP).  Because  training  is  invariably  scarce,  estimators  that  exploit  inherent  structure 
and  physical  radar  constraints  are  needed  in  practice.  This  chapter  develops  a  new  computationally 
efficient  estimator  which  jointly  enforces  a  Toeplitz  structure  and  a  rank  constraint  on  the  structured 
interference.  Previous  work  has  shown  that  exact  ML  estimation  of  Toeplitz  covariance  matrix  has 
no  closed  form  solution  and  most  versions  of  this  problem  result  in  iterative  estimators  which  are 
computationally  expensive.  Our  proposed  solution  focuses  on  a  computationally  efficient  approxima¬ 
tion  and  involves  a  cascade  of  two  closed  form  solutions.  First,  we  obtain  the  rank  constrained  ML 
estimator  (RCML)  whose  merits  have  recently  been  established  hrmly  for  radar  STAP.  The  central 
contribution  of  this  chapter  is  the  rank  preserving  Toeplitz  approximation,  which  we  demonstrate  can 
be  modeled  as  an  equality  constrained  quadratic  program  and  also  admits  a  closed  form.  Extensive 
performance  evaluation  on  both  simulated  and  KASSPER  data  confirms  that  the  proposed  estimator 
yields  unbeatable  performance  for  radar  STAP  under  the  previously  stated  conditions  of  rank  and 
Toeplitz  constraints. 
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2.2  Introduction 


Radar  systems  using  multiple  antenna  elements  that  coherently  process  multiple  pulses  offer  signif¬ 
icant  benefits  in  many  applications.  The  directivity  and  resolution  limits  of  a  single  sensor  can  be 
overcome  by  using  an  adaptive  array  of  spatially  distributed  sensors  makes  multiple  temporal  snap¬ 
shots  processing  possible.  Specifically,  joint  adaptive  processing  in  the  spatial  and  temporal  domains 
[44,  47,  48]  called  space  time  adaptive  processing  (STAP)  creates  an  ability  to  suppress  interference 
signals  while  simultaneously  preserving  gain  on  the  desired  signal.  For  STAP  to  be  successful  though, 
interference  statistics,  in  particular  the  covariance  matrix  of  the  disturbance  or  interference  must  be 
estimated  from  target  free  training  data,  and  therefore  training  plays  a  pivotal  role  in  adaptive  radar 
systems. 

To  obtain  accurate  estimates  of  the  disturbance  covariance  matrix,  a  large  number  of  homoge¬ 
neous  (target  free)  disturbance  training  samples  are  required  in  the  absence  of  any  prior  knowledge 
about  the  interference  environment.  A  compelling  challenge  for  radar  STAP  emerges  since  generous 
homogeneous  training  is  often  not  available  in  practice  [7].  This  problem  is  exacerbated  because  the 
estimation  process  must  be  repeated  for  each  range  bin  of  interest.  Much  recent  research  in  radar 
STAP  has  been  proposed  to  overcome  the  lack  of  generous  homogeneous  training.  One  approach  to 
this  problem  uses  a  priori  information  about  the  radar  environment  and  is  widely  referred  to  in  the 
literature  as  knowledge-based  processing  [29,  49,  50,  40,  51,  33,  34,  52].  A  subset  of  this  technique 
deals  with  intelligent  training  selection  for  reducing  both  the  number  of  required  training  samples  and 
computational  cost  [40,  12,  6].  Another  approach  to  improve  the  target  detection  performance  is  data 
selection  screening  among  the  training  data  to  excise  potential  outliers  [53,  54]. 

Covariance  matrix  estimation  techniques  that  enforce  and  exploit  specific  structure  inherent  to  the 
disturbance  phenomenon  have  merit  in  the  regime  of  extremely  limited  training  data.  Examples  of 
structure  include  persymmetry  [17],  eigenstructure  [24,  42],  circulant  structure  [21],  rank  constraint 
[55,  56],  multichannel  autoregressive  models  [13,  22],  physical  constraints  [23]  and  so  on.  In  par¬ 
ticular,  since  the  covariance  matrix  from  a  stationary  stochastic  signal  is  Hermitian  and  Toeplitz, 
estimating  Toeplitz  covariance  benefits  many  applications  such  as  array  processing  and  time  series 
analysis.  Such  a  Hermitian  Toeplitz  matrix  models  the  covariance  of  a  random  vector  obtained  by 
sampling  a  wide  sense  stationary  noise  field  with  a  uniform  linear  array  and  uncorrelated  narrow-band 
interferers  [18].  The  seminal  work  by  Burg  et  al.  [57]  proposed  an  iterative  method  for  estimation 
of  structured  covariance  matrices  using  the  ML  method  in  its  full  generality  .  Li  et  al.  developed 
the  asymptotic  maximum  likelihood  (AML)  estimation  for  structured  covariance  matrices  [19]  using 
the  extended  invariance  principle  (EXIF)  [58].  Approximation  of  arbitrary  matrices  by  a  (Hermitian) 
Toeplitz  matrix  using  matrix  decompositions  and  outer  approximations  has  separately  been  pursued 
in  applied  mathematics  [59,  60,  61,  62].  While  the  techniques  in  [59,  60,  61,  62]  were  not  conceived 
for  signal  processing  or  radar  STAP,  they  can  potentially  be  used  in  conjunction  with  classical  covari- 
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ance  estimation.  Of  particular  interest  is  Al-Homidan’s  h  sequential  quadratic  programming  (SQP) 
method  to  find  the  nearest  symmetric  positive  semi-definite  Toeplitz  matrix  to  given  a  matrix  [59]. 

2.2.1  Motivation  and  Challenges 

Various  estimation  and  approximation  techniques  of  Toeplitz  covariance  matrices  have  been  proposed 
[63,  64,  65,  66].  It  is  well  known  [18]  though  that  there  is  no  closed-form  solution  for  the  ML  estimation 
of  a  Hermitian  Toeplitz  covariance  matrix.  Many  Toeplitz  covariance  estimation  techniques  need 
the  assumption  of  large  sample  size  (i.e.  observed  training)  for  computational  tractability  [19], [66]. 
In  the  regime  of  realistic  training,  methods  rely  on  numerical  optimization  (often  non-convex),  are 
computationally  involved  and  hence  unsuitable  for  real-time/practical  deployment. 

Previous  works,  notably  in  statistics  [67,  68]  (and  references  therein)  have  also  shown  that  the  rank 
of  the  structured  interference  can  be  exploited  in  a  tractable  manner.  Rank  is  a  powerful  constraint 
in  covariance  estimation  and  can  often  be  determined  via  underlying  radar  physics.  Under  nominal 
assumptions,  the  Brennan  rule  [1]  may  be  used  to  determine  the  rank  of  the  structured  interference. 
Related  work  also  addresses  the  problem  of  determining  rank  in  non- ideal  scenarios  [69].  Recently, 
Kang  et  al.  proposed  the  rank  constrained  ML  (RCML)  estimation  of  structured  covariance  matrices 
[70]  which  exploits  the  knowledge  of  the  radar  noise  floor.  Kang  et  al.  [70]  also  report  another  estimator 
called  RCMLlb  for  the  case  when  the  noise  floor  is  assumed  unknown  and  only  a  lower  bound  (LB) 
is  available.  The  RCMLlb  estimator  generalizes  the  well-known  result  in  statistics  [67,  68].  In  the 
radar  context  though,  the  noise  variance  is  assumed  known  since  it  can  be  determined  by  placing  the 
radar  in  receive  only  mode  [71].  Notable  contributions  which  deal  with  both  the  rank  information 
and  Toeplitz  structure  of  the  covariance  matrix  jointly  includes  the  iterated  Toeplitz  approximation 
method  (ITAM)  [72]  proposed  by  Wilkes  and  Hayes  and  the  iterative  approach  by  Forster  et  al.  [65] . 
Both  approaches  are  based  on  a  computationally  expensive  iterative  procedure.  The  ITAM  estimator 
in  particular  has  been  shown  to  be  effective  under  very  low  training  because  of  its  ability  to  exploit 
structure  but  does  not  yield  scalable  performance  improvements  as  realistic  or  generous  training  is 
made  available. 

2.2.2  Our  contributions 

It  may  be  inferred  that  for  adequate  performance  under  limited  training,  computationally  involved 
estimators  such  as  ITAM  [72]  are  needed  but  online  covariance  estimation  is  often  needed  in  near 
real-time.  While  fast,  closed  form  estimators  such  as  AML  [19]  can  be  used,  they  do  not  excel 
under  low  or  realistic  training.  Our  contribution  aims  to  break  this  classical  trade-off.  We  develop 
a  computationally  efficient  approximation  of  structured  covariance  under  joint  Toeplitz  and  rank 
(EASTR)  constraints.  Specifically,  our  key  contributions  are  listed  next. 
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•  Analytically  tractable  framework  for  exploiting  both  Toeplitz  structure  and  the 
rank  of  the  structured  interference.  Our  proposed  estimator,  i.e.  EASTR,  satisfies  both 
Toeplitz  structure  property  (at  least  approximately)  and  the  rank  information  of  the  structured 
interference  at  the  same  time.  Decades  of  research  has  shown  that  enforcing  even  each  constraint 
individually  can  be  quite  onerous  (e.g.  rank  is  a  non-convex  constraint  and  no  known  closed  form 
exists  under  the  Toeplitz  constraint  for  all  training).  The  rank  constrained  ML  estimation  has 
been  achieved  recently  though  [70]  via  a  transformation  of  variables.  However,  this  does  not 
apply  when  the  Toeplitz  constraint  is  added.  We  propose  to  decouple  the  rank  and  Toeplitz 
constraints,  which  lends  analytical  tractability.  Crucially,  the  EASTR  solution  does  not  need 
iterative  steps  like  ITAM  and  as  will  be  established  in  Section  2.4,  Furthermore,  our  results 
demonstrate  that  EASTR  consistently  outperforms  ITAM. 

•  Computationally  efficient  and  fast  estimation  and  approximation.  Our  proposed 
method,  EASTR,  essentially  involves  a  cascade  of  two  steps  where  a  closed  form  solution  is 
available  in  each  step.  First  a  closed  form  solution  using  maximum  likelihood  employing  the 
rank  constraint  is  obtained  from  the  RCML  [70]  estimator.  Next,  we  propose  a  new  method  to 
perturb  the  eigenvalues  of  the  RCML  estimator  in  a  rank  preserving  manner  so  as  to  impose 
the  Toeplitz  structure.  We  formulate  a  new  quadratic  programming  (QP)  optimization  problem 
that  solves  for  the  eigenvalues  while  incorporating  Toeplitz  constraints  and  demonstrate  that 
this  problem  also  admits  a  closed  form  solution. 

•  Experimental  insights  and  improved  performance  in  low  training  regimes.  The  merits 
of  EASTR  are  also  verified  experimentally  over  both  simulated  data  and  realistic  data  sets 
such  as  Knowledge  Aided  Sensor  Signal  Processing  and  Expert  Reasoning  (KASSPER).  ITAM 
works  well  particularly  in  low  training  regimes  but  is  numerically  expensive.  The  asymptotic 
ML  estimation  gives  us  a  fast  closed  form  solution  but  shows  good  performances  only  in  high 
training  regimes.  EASTR  excels  across  all  training  regimes  while  still  permitting  closed  form 
solutions  attractive  for  practical  deployment. 

We  consider  two  cases:  1.)  when  the  Toeplitz  constraint  is  satisfied  exactly,  we  obtain  the  exact 
Toeplitz  estimate  satisfying  the  rank  constraint  and  Toeplitz  property  and  2.)  when  the  Toeplitz 
constraint  is  not  exactly  satisfied,  we  make  slight  modification  on  the  Toeplitz  constraint  and  derive  a 
modified  optimization  problem  to  obtain  approximately  Toeplitz  estimate.  In  practice,  the  available 
data  dictates  which  of  the  two  cases  in  invoked.  Experimental  investigation  shows  that  the  EASTR 
can  outperform  alternatives  in  the  sense  of  1.)  normalized  SINR  and  2.)  the  probability  of  detection, 
and  3.)  a  newly  proposed  trace  deviation  measure. 
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2.3  Methods,  Assumptions,  and  Procedures 


The  maximum  likelihood  covariance  estimate  R  is  one  which  maximizes  the  likelihood  function  based 
on  a  zero-mean  complex  circular  Gaussian  distribution: 

/(R)(Z)  =  exp  (  -  tr{Z"R-iZ})  (2.1) 

under  both  Toeplitz  and  rank  constraints.  In  (2.1),  K  is  the  number  of  training  samples,  N  is  the 
dimension  of  observations,  and  Z  is  an  iV  x  it"  matrix  whose  each  column  is  an  i.i.d.  observation 
vector.  With  some  algebraic  manipulations,  the  final  optimization  problem  may  be  written  as 


min 

R 

tr{R-iS}  +  log(lRl) 

S.t. 

R  =  (T^I  +  R, 

rank(Rc)  =  r 

[  Rc  €  T 

where  S  =  ^ZZ^  is  the  sample  covariance  matrix,  R^  denotes  the  interference  covariance  matrix,  I 
is  an  X  identity  matrix,  and  is  the  radar  noise  floor  which  can  be  readily  determined  using 
standard  techniques  [71] ,  and  lastly  T  is  the  set  of  all  A^  x  A^  Hermitian  positive  semi-definite  Toeplitz 
matrices, 

T  =  {T  :  T  e  =  T,  T  h  0  and  T  e  T}  (2.3) 

where  T  is  the  set  of  all  Toeplitz  matrices.  The  optimization  problem  (2.2)  is  particularly  hard  to 
solve  because  1.)  the  problem  in  not  convex  (and  no  known  transformations  exist  to  turn  it  into  one); 
hence  a  global  minimizer  is  virtually  impossible  to  find,  2.)  from  a  numerical  standpoint,  solutions 
are  known  to  be  computationally  burdensome  under  the  Toeplitz  constraint  alone  [57],  [72].  Adding 
the  rank  constraint  only  exacerbates  the  problem. 

In  view  of  the  aforementioned  challenges,  we  focus  on  covariance  matrix  estimation  that:  1.)  is 
fast  and  based  on  analytical  closed  forms  so  as  to  facilitate  practical  deployment,  and  2.)  exploits 
previously  known  insights  in  radar  STAP  so  that  performance  in  the  sense  of  high  SINK  and  can 
be  obtained  across  all  training  regimes. 

Our  proposed  solution  decouples  the  rank  and  Toeplitz  constraints,  and  develops  a  cascade  of 
two  closed  forms  as  the  final  estimator.  The  first  closed  form  is  obtained  by  employing  the  recently 
proposed  rank  constrained  ML  (RCML)  estimator  of  structured  covariance  [70].  The  hnal  RCML 
solution  is  given  by  [70] 

R*  =  cr^x*-^  =  (2.4) 

where  $  is  the  eigenvector  matrix  of  the  sample  covariance  matrix  S  and  A*  is  a  diagonal  matrix 
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with  optimal  diagonal  entries  A*  which  is  given  by 


{min(l,  — )  for  i  =  1,  2, . . . ,  r 

d^  (2.5) 

1  for  i  =  r  +  1,  r  +  2, . . . ,  TV 

where  di  is  the  ith  eigenvalue  of  the  sample  covariance  matrix  normalized  by  cr^,  S'  =  ^S. 

2.3.1  Conditions  for  Eigenvalnes  of  Toepiltz  Covariance 

Our  approach  now  involves  enforcing  the  Toeplitz  structure  on  top  of  the  RCML  estimator  in  (2.5). 
Let  the  eigenvector  matrix  of  S  be  and  the  eigenvalues  of  Rc  be  Ai,  A2, . . . ,  A^, . . . ,  Aat.  Since  we 
want  to  preserve  the  clutter  rank  constraint  ranfc(Rc)  =  r,  Rc  should  have  only  r  positive  eigenvalues 
and  the  rest  of  them  should  be  zero,  that  is 


Ai  ^  A2  ^  ^  Ac  >  Ac+i  —  Ac+2  —  ■  ■  ■  —  A]v  —  0 

Therefore,  Rc  can  be  expressed  as 

Rc  = 

where 


Ai 

0  ••• 

0 

0 

...  0 

0 

A2 

0 

0 

...  0 

0 

0 

0 

0 

...  0 

A  = 

0 

0  0 

Ac 

0 

...  0 

0 

0  ••• 

0 

0 

...  0 

0 

0  ••• 

0 

0 

0 

L  0 

0  ••• 

0 

0 

...  0 

^^'11 

4>12 

</'l3 

4>in 

4>21 

4>22 

4’23 

4>2N 

$  = 

(t>31 

4>32 

4>33 

4>3N 

1 - 

4>N2 

4’N3 

4>nn 

Therefore,  we  know  that  ijth  component  of  Rc  is  given  by 


(Rc)ij  —  'y  '  ^k4’ik4’jk 


(2.6) 


(2.7) 


(2.8) 


(2.9) 


(2.10) 
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Note  that  Rc  is  already  Hermitian,  that  is,  (Rc)ij  =  (Rc)Ji-  Now  in  order  for  Rc  to  be  Toeplitz 
matrix,  all  entries  on  each  diagonal  in  the  lower  triangular  part  in  Rc  must  have  same  values,  i.e.,  the 
following  equations  must  hold. 


II 

(Rc)22 

=  •  •  •  =  {'Rc)nN 

II 

(Rc)32 

=  ■■■  =  {IIc)n,N-1 

(Rc)31  — 

(Rc)42 

=  ■  ■  ■  =  {IIc)n,N-2 

(2.11) 

(Rc)Ar-l,l 

=  (Rc)Ar2 

Let  us  examine  the  hrst  condition  in  (2.11),  (Rc)ii  =  (Rc)22) 


r  r 

^k4'lk4'lk  =  X]  ^k4>2k4’2k 

fc=l  k=l 


It  can  be  also  expressed  as 

r 

Xk{<Plk(p*lk  -  (t>2k(t>*2k)  =  0 

k^l 

In  vector  form,  the  first  equation  is  given  by 


4^1r4^lr  4^2r4^2r 


(2.12) 


(2.13) 


(2.14) 


Since  the  elements  (fiij  of  the  eigenvalue  matrix  $  are  known  ($  is  the  eigenvector  matrix  of  the 
sample  covariance  matrix),  we  now  have  the  first  constraint  for  Toeplitz  covariance  matrix  as  a  linear 
combination  of  the  eigenvalues.  Other  equations  in  Eqs.  (2.11)  also  can  be  expressed  in  a  vector  form 
as  in  (2.14).  Consequentially,  we  have  a  total  of  N{N  —  l)/2  equations  and  finally  get  the  following 
equation  which  is  the  equality  constraint  of  our  optimization  problem. 


’®'A  0 


(2.15) 


where  each  row  of  ^  G  i)/2xr  (Jej^otes  coefficients  of  which  come  from  each  of  equations  in 

iT 


Eqs.  (2.11)  and  A  = 


Ai  X2 


Ay. 


Since  ^  Eq.  (2.15)  is  a  tall  matrix,  (2.15)  in  general  is  a  overdetermined  linear  system,  that 
is,  we  have  more  equations  than  unknowns.  The  solution  set  therefore  depends  on  the  rank  of  ’®'. 
The  first  case  is  that  we  have  an  infinite  set  of  solutions  when  the  column  rank  of  HI  is  less  than  r. 
On  the  other  hand,  when  has  a  full  column  rank,  we  have  the  trivial  solution,  A  =  0.  That  is, 
the  covariance  matrix  can  only  be  made  approximately  (and  not  exactly)  Toeplitz  in  this  case  -  this 


DISTRIBUTION  A:  Approved  for  public  release 


30 


remedy  is  discussed  in  Section  2.3.3. 


2.3.2  Exact  Toeplitz  Solution 

When  the  column  rank  of  is  less  than  r,  Eq.  (2.15)  has  an  infinite  number  of  solutions.  In  this 
case,  we  can  obtain  the  exact  Toeplitz  solution.  First,  let  Arcml  be  the  eigenvalues  obtained  from  the 
RCML  estimation,  which  is  given  by  Eq.  (2.5).  We  already  know  the  eigenvalues  A  from  the  RCML 
estimate  are  the  optimal  ML  estimate  of  the  true  structured  covariance  matrix  under  only  the  rank 
constraint.  Therefore,  we  want  the  eigenvalues  of  the  interference  covariance  matrix  to  satisfy  Eq. 
(2.15)  and  to  be  as  close  to  the  RCML  solution  as  possible.  Since  Eq.  (2.15)  has  an  infinite  number  of 
solutions,  we  can  find  the  closest  vector  of  the  eigenvalues  to  Arcml  by  solving  the  following  convex 
optimization  problem. 

min  IIArcml  — A|p 

^  (2.16) 
subject  to  :  ’S'A  =  0 

The  optimization  problem  (2.16)  is  a  well  known  quadratic  programming  (QP)  optimization  problem 
with  an  equality  constraint  and  therefore  the  closed  form  solution  is  available  using  KKT  condition 
[39]  and  it  is  given  by  solving  the  following  equation. 


(2.17) 


where  u*  is  a  vector  of  Lagrange  multipliers. 

However,  the  matrix  on  the  left-hand  side  of  Eq.  (2.17)  is  actually  singular  because  tE  has  not  full 
column  rank.  So  we  introduce  a  new  matrix  instead  of  ’4'  to  make  the  left  matrix  invertible  when 
we  solve  it.  That  is. 


21 

■ 

A* 

2Arcml 

0 

IZ* 

0 

21 

4,T  ' 

A* 

2Arcml 

0 

u* 

0 

(2.18) 


where  is  a  matrix  consists  of  rank(\I')  linearly  independent  rows  of  ’®'.  Obviously,  Eq.  (2.17)  and 
Eq.  (2.18)  have  the  same  solution  because  linearly  independent  rank(4')  rows  of  determine  the  set 
of  solutions  of  the  equation  and  removing  redundant  rows  does  not  make  any  changes  to  the  solution. 
It  follows  that  the  final  closed  form  solution  using  blockwise  inversion  property  is  given  by 


A*  =  (I  - 

and  the  final  covariance  matrix  can  be  obtained  by 


(2.19) 


(2.20) 
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2.3.3  Toeplitz  Approximation 

In  the  case  that  has  a  full  column  rank,  Eq.  (2.15)  has  the  only  one  solution,  A  =  0,  which 
does  not  yield  a  meaningful  covariance  matrix.  In  this  case,  the  optimization  problem  to  enforce 
the  Toeplitz  structure  must  be  modified.  One  possibility  is  to  explicitly  incorporate  the  eigenvector 
matrix  into  the  optimization.  This  however,  will  lead  to  a  computationally  expensive  problem  because 
the  optimization  must  constrain  the  eigenvector  matrix  to  be  unitary.  Further,  using  an  eigenvector 
matrix  to  agree  with  i.e.  the  one  obtained  from  sample  covariance  has  been  known  to  be  very 
successful  in  radar  STAP  [47,  25,  70]. 

We  therefore  take  the  approach  of  building  an  approximately  as  opposed  to  exactly  Toeplitz  matrix. 
This  can  be  done  by  computing  the  closest  rank  deficient  matrix  to  tit.  Consider  the  singular  value 
decomposition  of  4^, 

^  =  USV^  (2.21) 

The  well-known  theorem,  Eckart- Young  theorem  [73],  says  that  a  matrix  ’4'  with  the  column  rank  less 
than  r  that  minimizes  —  ’S'jjF  is  given  by 

#  =  USV^  (2.22) 


where  S  is  the  diagonal  matrix  obtained  from  S  by  replacing  the  r-th  diagonal  element  which  is  the 
smallest  diagonal  element  by  zero.  By  substituting  with  ^  in  Eq.  (2.15),  we  obtain  the  infinite 
number  of  solutions  for  A.  Now,  the  optimization  problem  becomes 


m^in  IJArcml  -  AJl^ 

subject  to  :  ’®'A  =  0 


(2.23) 


Finally,  a  Toeplitz  matrix  is  obtained  by  solving  the  above  optimization  problem  in  the  same  way 
done  in  the  case  of  exact  Toeplitz  solution,  that  is. 


A’^  =  (I  -  Arcml  (2.24) 

where  ’S'  is  a  matrix  consists  of  r  —  1  linearly  independent  rows  of  ’S'. 

Remark:  It  should  be  noted  that  the  actual  rank  of  ’S'  which  is  derived  from  $  depends  on  the 
training  data.  If  the  true  covariance  is  indeed  Toeplitz,  we  expect  training  samples  to  reflect  that 
particularly  in  the  regime  of  77  »  N  training  samples  (asymptotic  regime),  this  is  indeed  what  we 
observe  in  practice. 
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2.4  Results  and  Discussions 


2.4.1  Experimental  Setup  and  Methods  Compared 

In  this  section,  we  compare  the  performance  of  proposed  estimator  against  state  of  the  art  Toeplitz 
STAP  estimators.  Two  data  sets  are  used:  1.)  A  radar  covariance  simulation  model  and  2.)  the  well 
known  KASSPER  [8]  data  set. 

First,  we  model  a  radar  system  with  an  A^-element  uniform  linear  array.  The  overall  disturbance 
is  composed  of  jammer  and  white  interference.  Therefore,  the  external  wideband  noise  environment 
via  its  input  covariance  matrix  can  been  modeled  by 

,7 

R(n,  to)  =  ^  sinc[0.5/?i(n  —  +  a^S(n,  to)  (2.25) 

where  n^m  £  {1,  • .  • ,  A^},  J  is  the  number  of  jammers,  of  is  the  power  associated  with  the  Ah  jammer, 
(pi  is  the  jammer  phase  angle  with  respect  to  the  antenna  phase  center,  Pi  is  the  fractional  bandwidth, 
cr^  is  the  actual  power  level  of  the  white  disturbance  term,  and  5(n,m)  has  the  value  of  1  only  when 
n  =  m  and  0  otherwise.  This  simulation  model  has  infact  been  widely  and  very  successfully  used  in 
previous  literature  [24,  25,  74,  33]  for  performance  analysis.  It  is  easily  seen  that  R  is  Hermitian  and 
Toeplitz  since  R(n,  to)  depends  on  only  n  —  m  and  sine  function  is  an  even  function.  In  addition, 
R  generally  has  a  rank  less  than  N.  Therefore,  this  model  can  not  only  be  used  to  simulate  radar 
disturbance  samples  but  also  makes  ground  truth  covariance  available. 

Data  from  the  L-band  data  set  of  KASSPER  program  is  the  other  data  set  used  in  our  experiments. 
Note,  the  KASSPER  dataset  also  makes  the  true  ground  truth  covariance  available  and  we  picked 
range  bins  such  that  their  covariance  matrices  were  exactly  or  approximately  Toeplitz.  The  L-band 
data  set  consists  of  a  data  cube  of  1000  range  bins  corresponding  to  the  returns  from  a  single  coherent 
processing  interval  from  II  channels  and  32  pulses.  Therefore,  the  dimension  of  observations  (or  the 
spatio  temporal  product)  A^  is  11  x  32  =  352.  Other  key  parameters  are  detailed  in  Table  2.1. 

We  compare  the  following  six  different  covariance  estimation  techniques:  A  host  of  competing 
techniques  like  FML,  Eigen-canceller,  and  shrinkage  estimators  have  been  compared  with  the  RCML 
method  in  [70].  The  results  of  [70]  demonstrate  that  RCML  ouperforms  these  techniques  under  all 
conditions  of  training  data  support  and  hence  they  are  not  reproduced  here. 

•  Sample  Covariance  Matrix:  The  sample  covariance  matrix  is  given  by  S  =  —  ZZ^.  It 

K 

is  well  known  that  the  sample  covariance  is  the  unconstrained  maximum  likelihood  estimator 
under  Gaussian  disturbance  statistics.  We  refer  to  the  use  of  this  technique  as  SMI. 

•  Iterated  Toeplitz  Approximation  Method:  The  iterated  Toeplitz  approximation  method 
(ITAM)  [72]  alternatively  estimates  a  rank  deficient  matrix  using  the  eigenvalue  decomposition 
and  then  make  the  resulting  matrix  Toeplitz  by  substituting  diagonal  entries  with  the  average 
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Table  2.1:  KASSPER  Dataset- 1  parameters 


Parameter 


Value 


Carrier  Frequency 
Bandwidth  (BW) 

Number  of  Antenna  Elements 

Number  of  Pulses 

Pulse  Repetition  Frequency 

1000  Range  Bins 

91  Azimuth  Angles 

128  Doppler  Frequencies 

Clutter  Power 

Number  of  Targets 

Range  of  Target  Dop.  Freq. 


1240  MHz 

10  MHz 

11 
32 

1984  Hz 

35  km  to  50  km 
87°,  89°,  ...  267° 

-992  Hz,  -976.38  Hz,  . . .,  992  Hz 
40  dB 

226  (  200  detectable  targets) 
-99.2  Hz  to  372  Hz 


value  of  themselves  for  each  diagonal  of  the  estimated  matrix.  After  that,  the  same  process 
is  repeated  until  the  estimated  Toeplitz  matrix  has  a  desired  rank.  The  estimated  covariance 
satisfies  both  a  desired  rank  and  Toeplitz  property  and  it  is  closer  to  the  true  covariance  matrix 
in  the  sense  of  Frobenius  norm  than  the  sample  covariance  matrix. 

Asymptotic  Maximum  Likelihood:  The  asymptotic  maximum  likelihood  (AML)  [19]  ex¬ 
ploits  Toeplitz  property  of  the  structured  covariance  matrix.  The  authors  derived  a  closed-form 
formula  for  Toeplitz  covariance  matrix  estimation  and  it  facilitates  computationally  efficient 
implementation.  However,  they  assumed  a  large  number  of  training  samples  and  their  closed- 
form  solution  is  asymptotically  valid.  That  is,  in  the  low/realistic  training  regime,  estimation 
performance  invariably  suffers. 

Rank  Constrained  ML  estimator:  The  RCML  estimator  [70]  has  been  recently  proposed 
and  exploits  the  clutter  rank  information  of  the  structured  covariance  matrix  but  not  Toeplitz 
property.  It  is  also  the  hrst  step  of  the  closed  form  solution  of  our  proposed  method. 

Sequential  Quadratic  Programming:  Al-Homidan  proposed  a  sequential  quadratic  pro¬ 
gramming  (SQP)  algorithm  to  find  the  nearest  symmetric  positive  semi-definite  Toeplitz  matrix 
to  given  a  matrix  [59].  There  are  many  other  Toeplitz  approximation  algorithms  in  applied 
mathematics  [60,  61,  62].  We  choose  the  SQP  algorithm  largely  because  it  guarantees  a  global 
minima  in  approximation  error  and  the  h  SQP  method  is  considerably  faster  [59]  than  alterna¬ 
tives.  In  practice,  the  estimator  is  developed  by  making  a  Toeplitz  approximation  to  the  RCML 
estimator.  This  makes  the  technique  analogous  to  our  proposal  of  decoupling  the  rank  and 
Toeplitz  constraints  in  Section  2.3.  However,  using  applied  math  approximations  in  a  ‘black¬ 
box’  manner  has  two  major  drawbacks:  1.)  the  approximation  may  not  necessarily  preserve 
rank  and  radar  STAP  specific  structure  (e.g.  eigenvector  matrix  is  perturbed  as  well),  and  2.) 
the  techniques  are  numerically  involved  particularly  with  an  increase  in  data  dimension. 
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(a) 


(b) 


Figure  2.1:  Trace  deviation  measure  vs.  the  number  of  independent  snapshots  for  (a)  simulation 
model  and  (b)  KASSPER  data  set. 


•  EASTR:  The  proposed  Efficient  Approximation  of  Structured  covariance  under  joint  Toeplitz 
and  Rank  (EASTR)  constraints.  It  incorporates  Toeplitz  structure,  the  rank  of  the  clutter 
component  as  well  as  the  STAP  structural  constraint. 

In  the  results  to  follow,  the  ITAM,  RCML,  SQP  and  EASTR  exploit  rank  information.  The 
clutter  rank  for  the  simulation  model  covariance  is  of  course  known  and  for  the  KASSPER  data  set 
was  inferred  via  the  Brennan  rule. 


2.4.2  Whiteness  Test 

Before  using  popular  radar  STAP  measures,  we  apply  a  ‘whiteness  test’.  The  trace  deviation  measure 
[56]  is  one  way  of  evaluating  covariance  matrix  estimators  since  it  captures  the  extent  to  which  the 
estimated  covariance  matrix  whitens  the  true  covariance  matrix.  It  is  given  by 

TRD(R)  =  |tr{R~iR}/A  -  1|  (2.26) 

Intuitively,  we  can  see  that  its  lower  bound  is  zero  when  R  =  R  and  smaller  value  of  TRD  means 
better  performance. 

Fig.  2.1  shows  bargraphs  of  the  performance  of  compared  methods  for  simulation  model  and 
KASSPER  data  set  respectively.  Fig.  2.1a  shows  bargraphs  of  the  performance  in  terms  of  TRD 
measure  versus  the  number  of  training  samples.  Because  the  SMI  and  the  AML  show  very  high  TRD 
values,  we  do  not  plot  them  in  Fig.  2.1a.  Fig.  2.1b  similarly  shows  the  result  of  TRD  measure  across 
three  training  regimes  for  the  KASSPER  data  set.  The  TRD  measure  results  in  Figs.  2.1a  and  2.1b 

DISTRIBUTION  A:  Approved  for  public  release 


35 


SINR 


number  of  independent  snapshots,  K 


Figure  2.2:  Normalized  SINR  versus  the  number  of  independent  snapshots  for  simulation  model. 
A^  =  20 


reveal  hence  that  EASTR  is  infact  “structurally”  the  closest  to  the  true  covariance  matrix. 


2.4.3  Normalized  SINR 

The  normalized  SINR  measure  [44]  is  commonly  used  in  the  radar  literature  and  is  given  by 

~  |s^R^iRR-is||s«R-is| 


where  s  is  the  spatio-temporal  steering  vector,  R^  is  the  data-dependent  estimate  of  R  at  the  i-th 
trial,  and  R  is  the  true  covariance  matrix.  It  is  easily  seen  that  0  <  77  <  1  and  77  =  1  if  and  only  if 
R  =  R.  The  SINR  is  plotted  in  dB,  that  is,  SINRi(dB)  =  101ogig77i.  Therefore,  SINRi(dB)  <  0. 

We  plot  the  normalized  average  SINR  versus  the  number  of  training  samples  K  in  Fig.  2.2.  In  this 
case,  we  consider  the  presense  of  wideband  jamming  J  =  3.  In  particular,  the  fractional  bandwidth 
Pi  =  [0.2, 0,0.3],  the  powers  and  phases  of  jammers  are  10  dB,  20  dB,  30  dB  and  20  deg,  40  deg, 
and  60  deg,  respectively.  When  K  <  N  the  sample  covariance  is  singular,  therefore  we  used  its 
pseudo-inverse  instead  of  inverse  itself.^  Interpreting  the  results  in  Fig.  2.2,  it  is  useful  to  start  with 
AML  which  does  particularly  well  when  training  is  generous  K  »  N .  However,  because  AML  is 
asymptotically  based  -  its  performance  is  poor  when  K  <  N  01  K  N  {K  in  the  vicinity  of  N  is 
often  considered  realistic  training).  Even,  the  SMI  and  SQP  estimators  are  better  than  AML  when 
training  is  low/realistic.  ITAM  is  effective  in  very  low  training  as  expected  because  its  exploits  both 
rank  and  Toeplitz  constraints  (though  in  a  largely  heuristic  way)  -  ITAM  does  not  exhibit  scalable 
improvements  as  training  support  is  increased.  EASTR  performs  the  best  overall,  even  better  than 
RCML  (which  was  recently  demonstrated  to  be  the  most  competitive  radar  STAP  estimator  [70])  by 

^Note  also  that  SMI  and  AML  have  a  dip  when  K  =  20  due  to  numerical  instabilities  in  the  K  =  N  training  regime. 
In  contrast,  ITAM,  RCML,  EASTR,  and  SQP  guarantee  nonsingularity  in  all  training  regimes. 
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Normalized  SINR  :  250  training  homogeneous  samples 


Normalized  SINR  :  250  training  homogeneous  samples 


(b) 


Normalized  SINR  :  352  training  homogeneous  samples 


(c) 


Normalized  SINR  :  352  training  homogeneous  samples 


(d) 


Normalized  SINR  :  450  training  homogeneous  samples 


(e) 


Normalized  SINR  :  450  training  homogeneous  samples 


(f) 


Figure  2.3:  Normalized  SINR  versus  azimuthal  angle  for  KASSPER  data  set.  (a)  and  (b)  for  K{ 
250)  <  N,  (c)  and  (d)  for  AT  =  iV  =  352,  and  (e)  and  (f)  for  K{=  450)  >  N 


DISTRIBUTION  A:  Approved  for  public  release 


37 


Normalized  SINR  :  250  training  homogeneous  samples 


(a) 


Normalized  SINR  :  250  training  homogeneous  samples 


(b) 


Normalized  SINR  :  352  training  homogeneous  samples 


Normalized  SINR  :  352  training  homogeneous  samples 


(d) 


Normalized  SINR  :  450  training  homogeneous  samples 


Normalized  SINR  :  450  training  homogeneous  samples 


(f) 


Figure  2.4:  Normalized  SINR  versus  Doppler  frequency  for  KASSPER  data  set.  (a)  and  (b)  for 
K{=  250)  <  iV,  (c)  and  (d)  ior  K  =  N  =  352,  and  (e)  and  (f)  for  K{=  450)  >  N 
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PD  vs.  SNR  in  NMF  test  (K=N=20) 


PD  vs.  SNR  in  NMF  test  (K=2N=40) 


(a) 


(b) 


Figure  2.5:  Probability  of  detection  vs.  SNR  for  simulation  model  via  normalized  matched  filter 
(NMF)  test,  [a)  K  =  N  =  20  and  (b)  K  =  2N  =  40. 


virtue  of  additionally  capturing  the  Toepltiz  structure  on  covariance. 

Fig.  2.3  and  Fig.  2.4  similarly  plots  the  normalized  SINK  result  for  KASSPER  data  set  as  a 
function  of  the  azimuthal  angle  and  Doppler  frequency  for  three  different  training  regimes.  Specifically, 
the  first  rows  of  Fig.  2.3  and  Fig.  2.4  are  corresponding  to  K  =  250(<  N),  the  second  rows  are 
corresponding  to  K  =  N  =  352,  and  finally  the  third  rows  are  corresponding  to  K  >  N  =  450 
training  samples.  Plots  in  the  right  column  show  zommed  in  versions  of  ITAM,  EASTR,  and  RCML. 
The  sample  covariance  technique  and  the  AML  suffer  tremendously  when  K  <  N.  For  low  training, 
ITAM  shows  comparable  performance  to  the  EASTR  and  the  RCML  estimators  in  some  ranges  of  the 
azimuthal  angle  but  is  worse  in  some  other  ranges.  On  an  average  (over  azimuthal  angle  and  Doppler 
frequency),  EASTR  is  easily  the  best  in  Fig.  2.3  and  Fig.  2.4,  even  providing  appreciably  gains  over 
the  second  best  RCML  estimator.  Further,  EASTR  is  stable  and  effective  across  all  training  regimes 
K  <  N,K  ^  N  and  K  >  N. 


2.4.4  Probability  of  Detection  vs.  SNR 


In  order  to  compute  probability  of  detection,  Pd,  we  apply  the  normalized  matched  filter  (NMF)  [75] 
as  the  test  statistic 


Is^R 


H 


C  Anmf 


(2.28) 


[s^R-is][x-'^R-ix]  Ho 

where  x  and  K  are  the  observation  vector  and  the  number  of  training  samples,  respectively.  The 
detection  probability  Pd  is  defined  as  the  probability  that  the  value  of  test  statistic  is  grater  than  a 
threshold  conditioned  on  the  hypothesis  that  the  received  data  includes  target  information.  Therefore, 
it  depends  on  signal  to  noise  ratio  (SNR,  by  virtue  of  s,)  and  the  estimated  covariance  matrix.  Since  Pd 
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1 


PD  vs.  SNR  in  NMF  test  (K=N=352) 


PD  vs.  SNR  in  NMF  test  (K=2N=704) 


(a)  (b) 


Figure  2.6:  Probability  of  detection  vs.  SNR  for  KASSPER  data  set  via  normalized  matched  filter 
(NMF)  test,  {a)  K  =  N  =  352  and  K  =  2N  =  704. 


does  not  typically  admit  a  closed  form,  we  first  generate  a  number  of  samples  from  the  true  covariance 
to  determine  A  corresponding  to  the  fixed  false  alarm  rate  and  then  employ  Monte  Carlo  simulations 
to  evaluate  Pd  corresponding  to  each  estimator.  We  set  a  constant  false  alarm  rate  to  10“'*. 

Fig.  2.5  shows  the  detection  probability  Pd  for  simulation  model  plotted  as  a  function  of  SNR 
for  different  estimators.  We  use  K  =  N  =  20  and  K  =  2N  =  40  training  samples  to  estimate  the 
covariance  matrix  in  Fig.  2.5a  and  Fig.  2.5b,  respectively.  It  is  well-known  that  K  =  2N  training 
samples  are  needed  to  keep  the  performance  within  3  dB.  Indeed,  we  see  that  the  sample  covariance 
matrix  has  about  3  dB  loss  vs.  the  true  covariance  matrix  in  Fig.  2.5b.  The  proposed  EASTR  is  the 
closest  to  the  Pd  achieved  by  using  the  true  covariance  matrix  (upper  bound)  for  both  cases.  In  Fig. 
2.5a,  we  do  not  plot  for  ITAM  and  SQP  because  they  do  not  guarantee  positive  semi-definiteness  of 
final  estimate  in  the  case  oi  K  =  N  =  20,  so  we  cannot  calculate  the  detection  probabilities  for  them. 

Fig.  2.6  also  shows  the  probability  of  detection  versus  SNR  plots.  We  use  the  same  training 
regimes  as  used  in  Section  2.4.3.  Fig.  2.6a  and  Fig.  2.6b  plot  results  for  K  =  352  and  K  =  2N  =  704, 
respectively.  We  can  see  similar  trends  in  Fig.  2.6  hence  for  KASSPER  data  to  the  ones  for  the 
simulation  model  in  Fig.  2.5.  EASTR  exhibits  the  best  performance  in  both  plots. 


2.5  Conclusions 

Our  work  focuses  on  jointly  exploiting  a  Toeplitz  structure  as  well  as  a  rank  constraint  on  the  clutter 
covariance  for  radar  STAP.  The  problem  is  inherently  hard  because  it  is  well  known  that  there  is 
no  closed  form  solution  for  ML  estimation  under  Toeplitz  constraint  for  all  training  regimes.  While 
past  work  has  provided  iterative  often  expensive  solutions,  we  develop  a  new  estimator  that  is  based 
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on  a  cascade  of  two  closed  forms.  The  first  closed  form  is  the  recently  proposed  RCML  estimator. 
Our  core  contribution,  the  second  step  of  Toeplitz  approximation  performs  constrained  optimization 
of  eigenvalues  to  either  exactly  or  approximately  satisfy  the  Toeplitz  constraint  without  compromis¬ 
ing  the  rank.  Crucially,  this  optimization  also  has  a  closed  form  making  the  overall  estimator  very 
friendly  from  a  computational  standpoint.  Via  performance  analysis  evaluating  probability  of  detec¬ 
tion,  normalized  SINK,  and  trace  deviation  measure,  our  estimator  is  shown  to  outperform  traditional 
efforts  in  Toeplitz  and  low  rank  covariance  estimation  including  those  based  on  expensive  numerical 
solutions.  Recently,  the  optimality  of  the  fast  maximum  likelihood  [24]  covariance  estimator  has  been 
proven  with  respect  to  cost  functions  involving  the  Frobenius  or  the  spectral  norm  [76].  EASTR  can 
also  be  investigated  for  similar  notions  of  optimality.  In  addition,  more  analysis  of  our  estimator  such 
as  asymptotic  convergence  can  be  performed.  Finally,  practical  evaluation  may  be  performed  on  other 
radar  data  sets  involving  departures  from  idealized  scenarios. 
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Chapter  3 


Robust  Covariance  Estimation 
under  Imperfect  Constraints  using 
Expected  Likelihood  Approach 

3.1  Introduction 

Radar  systems  using  multiple  antenna  elements  and  processing  multiple  pulses  are  widely  used  in 
modern  radar  signal  processing  since  it  helps  overcome  the  directivity  and  resolution  limits  of  a  single 
sensor.  Joint  adaptive  processing  in  the  spatial  and  temporal  domains  for  the  radar  systems,  called 
space  time  adaptive  processing  (STAR)  [47,  48,  44],  enables  to  suppress  interfering  signals  as  well  as 
to  preserve  gain  on  the  desired  signal.  Interference  statistics,  in  particular  the  covariance  matrix  of 
the  disturbance,  which  must  be  estimated  from  secondary  training  samples  in  practice  plays  a  critical 
role  on  success  of  STAR.  To  obtain  a  reliable  estimate  of  the  disturbance  covariance  matrix,  a  large 
number  of  homogeneous  training  samples  are  necessary.  This  gives  rise  to  a  compelling  challenge  for 
radar  STAR  because  such  generous  homogeneous  (target  free)  training  is  generally  not  available  in 
practice  [7]. 

Much  recent  research  for  radar  STAR  has  been  developed  to  overcome  this  practical  limitation 
of  generous  homogeneous  training.  Specifically,  the  knowledge-based  processing  which  uses  a  priori 
information  about  the  interference  environment  is  widely  referred  in  the  literature  [29,  40]  and  has 
merit  in  the  regime  of  limited  training  data.  These  techniques  include  intelligent  training  selection 
[29]  and  the  spatio-temporal  degrees  of  freedom  reduction  [40,  12,  6].  In  addition,  covariance  matrix 
estimation  techniques  the  enforce  and  exploit  a  particular  structure  have  been  pursed  as  one  approach 
of  these  techniques.  Examples  of  structure  include  persymmetry  [17],  Toeplitz  structure  [19,  18,  20], 
circulant  structure  [21],  and  eigenstructure  [24,  70,  25].  In  particular,  the  fast  maximum  likelihood 
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(FML)  method  [24]  which  enforces  a  special  eigenstructure  that  the  disturbance  covariance  matrix 
represents  a  scaled  identity  matrix  plus  a  rank  deficient  and  positive  semidefinite  clutter  component 
also  falls  in  this  category  and  is  shown  to  be  the  most  competitive  technique  experimentally. 

Recently,  the  works  by  Kang  et  al.  [70]  and  Aubry  et  al.  [25]  have  also  improved  upon  the  FML  by 
exploiting  practical  constraints  inspired  by  physical  radar  environment,  specifically  the  eigenstructure 
of  the  disturbance  covariance  matrix  for  radar  STAP.  They  employed  a  rank  of  the  clutter  subspace 
and  a  condition  number  of  the  interference  covariance  matrix  respectively  as  a  constraint  as  well  as 
the  structural  constraint  used  in  the  FML  into  the  optimization  problem.  For  both  methods,  though 
the  initial  optimization  problems  are  non-convex,  the  estimation  problems  are  reduced  to  a  convex 
optimization  problems  and  admit  closed-form  solutions.  Their  methods  have  also  been  shown  to 
enable  higher  normalized  SINK  over  the  state-of-the  art  alternatives  for  the  simulation  model  and  the 
knowledge- aided  sensor  signal  processing  and  expert  reasoning  (KASSPER)  data  set. 

In  [70] ,  the  authors  assume  the  rank  of  the  clutter  is  given  by  Brennan  rule  [1]  under  ideal  conditions 
of  no  coupling.  However,  in  practice  (under  non-ideal  conditions)  the  clutter  rank  departs  from  the 
Brennan  rule  prediction  due  to  antenna  errors  and  internal  clutter  motion.  In  this  case,  the  rank  is  not 
known  precisely  and  needs  to  be  determined  before  using  with  the  RCML  estimator.  Determination 
of  the  number  of  signals  in  a  measurement  record  is  a  classical  eigenvalue  problem,  which  has  received 
considerable  attention  in  the  past  60  years.  It  is  important  to  note  that  the  problem  does  not  have 
a  simple  and  unique  solution.  Consequently,  a  number  of  techniques  have  been  developed  to  address 
this  problem  [77,  78,  79,  80,  81].  In  addition,  the  noise  level  and  the  condition  number  should  be 
estimated  as  well  if  they  are  unknown  or  non  precisely  known  in  practice. 

Expected  likelihood  (EL)  approach  [82]  has  been  proposed  to  determine  a  regularization  parameter 
based  on  the  statistical  invariance  property  of  the  likelihood  ratio  (LR)  values.  More  specifically,  the 
probability  distribution  function  (pdf)  of  LR  values  for  the  true  covariance  matrix  depends  on  only 
the  number  of  training  samples  (K)  and  the  dimension  of  the  true  covariance  matrix  (N),  not  the 
true  covariance  itself  under  a  Gaussian  assumption  on  the  observations.  This  statistical  independence 
of  LR  values  on  the  true  covariance  itself  enables  pre-calculation  of  LR  values  even  though  the  true 
covariance  is  unknown.  Finally,  the  regularization  parameters  are  selected  so  that  the  LR  value  of  the 
estimate  agrees  as  closely  as  possible  with  the  median  LR  value  determined  via  its  pre-characterized 
pdf. 

Contributions:  In  view  of  the  aforementioned  observations,  we  develop  covariance  estimation 
methods  which  automatically  and  adaptively  determines  the  values  of  practical  constraints  via  an 
expected  likelihood  approach  for  practical  radar  STAP.  Our  main  contributions  are: 

•  A  method  of  choice  of  constraints  using  the  EL  approach:  We  propose  a  method  of  a 
choice  of  practical  constraints  employed  in  the  optimization  problems  for  covariance  estimation  in 
radar  STAP  using  the  expected  likelihood  approach.  The  proposed  method  guides  the  selection 
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of  the  constraints  via  the  expected  likelihood  criteria  in  the  case  that  the  knowledge  of  the 
constraints  is  imperfectly  known  in  practice.  We  consider  three  different  cases  of  the  constraints 
in  this  chapter:  1)  only  the  clutter  rank  constraint,  2)  both  the  clutter  rank  and  the  noise  power 
constraints,  and  3)  the  condition  number  constraint. 

•  Analytical  results  with  formal  proofs  for  three  different  cases  of  imperfect  con¬ 
straints:  For  each  case  mentioned  above,  we  develop  significant  analytical  results.  We  first 
formally  prove  that  the  rank  selection  problem  based  on  the  expected  likelihood  approach  has  a 
unique  solution.  This  guarantees  there  is  only  one  rank  which  is  the  best  (global  optimal)  rank 
in  the  sense  of  the  EL  approach.  Second,  we  derive  a  closed  form  solution  of  the  optimal  noise 
power  in  the  sense  of  the  EL  approach  for  a  given  rank.  This  means  we  do  not  need  iterative  or 
numerical  method  to  find  the  optimal  noise  power  and  enables  fast  implementation.  Einally,  we 
also  prove  there  exists  the  unique  condition  number  for  the  condition  number  selection  method 
via  the  EL  approach. 

•  Experimental  Results  through  simulated  model  and  the  KASSPER  data  set:  Ex¬ 
perimental  investigation  on  a  simulation  model  and  on  the  KASSPER  data  set  shows  that  the 
proposed  methods  for  three  different  cases  outperform  alternatives  such  as  the  EML,  leading 
rank  selection  methods  in  radar  literature  and  statistics,  and  the  ML  estimation  of  the  condi¬ 
tion  number  constraint  in  the  sense  of  normalized  SINK. 

The  rest  of  the  chapter  is  organized  as  follows.  We  provide  the  proposed  methods  of  the  constraint 
selection  problems  via  the  EL  approach  in  Section  3.2.  Experimental  validation  of  our  method  is 
provided  in  Section  3.3  wherein  we  report  the  performance  of  the  proposed  method  and  compare 
it  against  existing  methods  in  terms  of  normalized  SINK  on  both  the  simulation  model  and  the 
KASSPER  data  set. 


3.2  Constraints  selection  method  via  Expected  Likelihood  Ap¬ 
proach 

3.2.1  Imperfect  rank  constraint 

In  Chapter  1,  we  discuss  that  the  RCML  estimator  is  not  only  powerful  in  practice  but  also  compu¬ 
tationally  cheap  and  the  EL  approach  is  shown  to  be  useful  to  select  parameters  so  that  the  estimate 
is  consistent  with  the  true  covariance  matrix  in  the  sense  of  the  LR  value.  Erom  Eq.  (2.5),  we  see 
the  RCML  solution  is  a  function  of  the  rank  r  and  di’s  which  are  given  in  the  problem.  We  propose 
to  use  the  EL  approach  to  rehne  and  find  the  optimal  rank  when  the  rank  determined  by  underlying 
physics  is  not  necessarily  accurate. 
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12 


Figure  3.1:  ^log  LR  (Rrcml(?'),  Z)/ LRq  versus  r  for  KASSPER  dataset  {K  =  2N  =  704) 

Now  we  set  up  the  optimization  criterion  to  find  the  rank  via  the  EL  approach.  We  find  a  rank 
which  makes  its  corresponding  LR  value  closer  to  LRq  than  any  other  ranks.  That  is, 

Rrcml,,  =a2VA*-i(f)V"  (3.1) 


where 


f  =  arff  min 


LR  (Rrcml(^),  Z) 


—  LRo 


2 


(3.2) 


and  LR  (RRCML(r),  Z)  is  given  by  Eq.  (3.3). 

Now  we  investigate  the  optimization  problem  (3.2)  for  the  rank  selection.  Since  the  eigenvectors  of 

Rrcml  are  identical  to  those  of  the  sample  covariance  matrix  S,  the  LR  value  of  Rrcml  in  Eq-  (3.2) 

can  be  reduced  to  the  function  of  the  eigenvalues  of  Rrcml  and  S.  Let  the  eigenvalues  of  Rrcml 

and  S  be  Ai  and  di  (arranged  in  descending  order) .  Then  the  LR  value  of  Rrcml  can  be  simplified 

di 

to  a  function  of  ratio  of  to  Aj,  — .  That  is, 


LR  (Rrcml  (c),  Z) 


I  Rrcml  (QS I  exp  A 
exp  (tr 


N 

n 


A. 


•  exp 


exp 


[E 


A". 


(3.3) 


(3.4) 


Lemma  1.  The  LR  value  of  the  RCML  estimator,  LR  (Rrcml (c),  Z) ,  is  a  monotonically  increasing 
function  with  respect  to  the  rank  r  and  there  is  only  one  unique  f  in  the  optimization  problem  (3.2). 


Proof.  First,  let  r  be  the  largest  i  such  that  di+i  >  cr^.  Then,  from  the  closed  form  solution  of  the 
RCML  estimator,  the  eigenvalues  of  the  RCML  estimator  with  rank  i  and  i  +  1  for  given  i  <  r  will  be 

•  Rrcml(«)  :  di,  d2,  ...,  di,  cr^,  ...,  cr^ 
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•  Rrcml(*  +  1)  :  di,  d2,  di,  di+i,  cr^,  .. 


■2  a2 


since  di+i  >  Then  ^  should  be 


•  Hrcml(^)  :  1,  1, 


■  •  5 


(In 


(T^ 


^2  +  2 
^2  ’ 


div 


a 


•  R'RCMl(^  +  1)-  I5  •••5  lii  li+li 

From  Eq.  (3.4),  the  LR  values  of  the  RCML  estimators  with  the  ranks  i  and  i  -\-l  are 


LR(i)  = 


exp  N 


N 


5)  n 


1  ^ 

exp(z  +  ^  dk) 


k—i-^1 


N 


LR(i  +  1)  = 


exp  77  j 

^2iN-i-l)  11  “fc 

fc  =  2  +  2 


1  ^ 

exp(z  +  1  +  ^  dk) 


fc=2+2 


From  Eq.  (3.5)  and  Eq.  (3.6),  we  obtain 


N 


LR(i  +  1)  = 


exp  IV  -|-|- 

^2iN-i-l)  11  “fc 

fc=2  +  2 


N 


exp(z  +  1  +  ^  W  dk) 
fc=2+2 

exp  TV 

nj  11  dk-- 


fc  =  2+l 


h+1 


1  ^ 

exp(z  +  ^  y^dk)  exp(l  - 

(7^  ^ ^ 


A;=2+1 

di+i 


di+i . 
ct2 


=  LR(z)  •  •  exp(^^  -  1) 

tti+l 


Eq.  (3.9)  tells  us  LR(z+l)  can  be  calculated  by  multiplying  LR(z)  by  the  coefficient 

d, 

Fig.  3.2  shows  that 


(3.5) 


(3.6) 


(3.7) 


(3.8) 


(3.9) 


exp(^-l). 
k+l  cr^ 


exp(^^  -  1)  >  1 

rr^ 


^2+1 


(3.10) 


^2 

for  all  values  of  - - .  Therefore,  it  is  obvious  that 

«i+l 


LR(t  +  1)  >  LR(t), 


(3.11) 


which  means  the  LR  value  monotonically  increases  with  respect  to  z. 
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Now,  let’s  consider  the  other  case,  i>  r.  In  this  case,  since  di+i  <  cr^,  it  is  easily  shown  that 


R.rcml(*)  =  R-rcml(*  +  1) 


(3.12) 


Therefore, 

LR(i  +  1)  =  LR(i)  (3.13) 

This  proves  that  LR(i)  monotonically  increases  for  all  1  <  i 

□ 

Lemma  1  gives  us  a  significant  analytical  result  that  is  the  EL  approach  leads  to  a  unique  value 
of  the  rank,  i.e.,  when  searching  over  the  various  values  of  the  rank  it  is  impossible  to  come  up  with 
multiple  choices.  That  also  means  that  it  is  guaranteed  that  we  can  always  find  the  global  optimum 
of  r  not  local  optima  (minima)  for  the  optimization  problem  (3.2)  regardless  of  an  initial  value  of  r. 
We  plot  the  values  of  ^  log  ^  LR  (Rrcml  (?') ,  Z)  /  LRq  ^  ^  versus  the  rank  r  for  one  realization  for  the 
KASSPER  dataset  (AT  =  2N  =  704)  in  Fig.  3.1.  Since  the  LR  values  are  too  small  in  this  case,  we 
use  a  log  scale  and  the  ratio  between  two  instead  of  the  distance  to  see  the  variation  clearly.  Note 
that  monotonic  increase  of  the  value  of  LR  (Rrcml  (?'),  Z)  w.r.t  r  guarantees  a  unique  optimal  rank 
even  if  the  optimization  function  as  defined  in  (3.2)  is  not  necessarily  convex  in  r. 

The  algorithm  to  find  the  optimal  rank  is  simple  and  not  computationally  expensive  due  to  the 
analytical  results  above.  For  a  given  initial  rank  such  as  Brennan  rule  for  the  KASSPER  data  set  and 
the  number  of  jammers  for  a  simulation  model,  we  first  determine  a  direction  of  searching  and  then 
find  the  optimal  rank.  The  procedure  of  finding  the  optimal  rank  is  shown  in  Algorithm  1  in  detail. 
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Algorithm  1  The  proposed  algorithm  to  select  the  rank  via  EL 
1;  Initialize  the  rank  r  by  physical  environment  such  as  Brennan  rule. 

2:  Evaluate  LR(r  —  1),  LR(r),  LR(r  +  1)),  the  LR  values  of  RCML  estimators  for  the  ranks  r  —  1,  r, 
r  +  1,  respectively. 

•  if  I  LR(r  +  1)  -  LRo  I  <  I  LR(r)  -  LRq  | 

— >  increase  r  by  1  until  |  LR(r)  —  LRo  |  is  minimized  to  find  f. 

•  elseif  I  LR(r  -  1)  -  LRq  |  <  |  LR(r)  -  LRq  | 

— >  decrease  r  by  1  until  |  LR(r)  —  LRg  |  is  minimized  to  find  f. 

•  else  f  =  r,  the  initial  rank. 


3.2.2  Imperfect  rank  and  noise  power  constraints 

In  this  section,  we  investigate  the  second  case  that  both  the  rank  r  and  the  noise  power  cr^  are  not 
perfectly  known.  We  propose  the  estimation  of  both  the  rank  and  the  noise  level  based  on  the  EL 
approach.  The  estimator  with  both  the  rank  and  the  noise  power  obtained  by  the  EL  approach  is 
given  by 

Rrcmlkl  (3.14) 


where 


(f,  (7^)  =  arg  min  LR  (RRCML(r,  cr^),  Z)  -  LRq 

reZ,(T2>0 


In  section  3.2.1,  we  have  shown  that  the  optimal  rank  via  the  EL  approach  is  uniquely  obtained 
for  a  hxed  Now  we  analyze  the  LR  values  of  the  RCML  estimator  for  various  and  a  fixed  rank. 

Lemma  2.  For  a  fixed  rank,  the  LR  value  of  the  RCML  estimator,  which  is  a  function  of  a'^,  has  a 
maximum  value  at  R  monotonically  inereases  for  and  monotonically  decreases 

for  0-2  > 

Proof.  In  this  section,  I  investigate  the  LR  values  for  varying  noise  level  cr^  and  a  given  rank  r.  From 
Eq.  (3.5)  we  obtain  the  LR  value  when  the  rank  is  r. 


LR(a2)  = 


exp  A  -p-p  ^ 
fj2{N-r}  li 

i  ^ 

exp(r  +  —  ^  dk) 


(3.16) 


For  simplicity,  let  =  t  then  Eq.  (3.16)  can  be  simplihed  as 


LR(t)  = 


n  dk 


Ylk^r+l  dk 

i  t 
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N 


N 


Now  let 


H  dk  =  ds  and  []  dk 

k—r-\-l  k—r-\-l 


=  dp,  then 


LR(t) 


e^-^dp 


(3.18) 

(3.19) 


To  analyze  increasing  or  decreasing  property  Eq.  (3.19),  I  calculate  its  first  derivative.  Since  dpC^  ’’ 
is  a  positive  constant,  it  does  not  affect  increasing  or  decreasing  of  the  function.  Therefore, 


e  ‘  ) 

(r  - 

(3.20) 

(r  - 

(3.21) 

t'-^-2((r-iV)t  +  4)e-‘^=/* 

(3.22) 

Since  f  ^  and  e  are  always  positive,  the  first  derivative  {t 


0  if  and  only  if 


t  = 


J2k=r+1  dk 

N  —  r 


(3.23) 


A/^  (i 

and  it  is  positive  when  t  <  — —  and  negative  otherwise.  This  means  that  LR(f7^)  increases 


for  < 


EN  7 

k—r-\-l  j  j  f  2  2^k— 

and  decreases  tor  > 


N  —  r 


_Li  dk 

- - .  The  LR  value  is  maximized  when  = 


k—r-\-l 


dk 


N  —  r 

.  * _ _  is  the  average  value  of  N  —  r  smallest  eigenvalues  of  the  sample 

N  —  r  N  —  r 

covariance  matrix  and  in  fact  a  maximum  likelihood  solution  of  as  shown  in  the  RCML  estimator 


AT  7 

l^k=r+l  k  l^k=. 


[70]. 


□ 


Fig.  3.3  shows  an  example  of  the  LR  values  as  a  function  of  the  noise  level  As  shown  in  Lemma 
2,  we  see  that  the  LR  value  is  maximized  for  the  ML  solution  of  and  monotonically  increases  and 
decreases  for  each  direction.  It  is  obvious  that  we  have  three  cases  of  the  solution  of  the  optimal  noise 
power  from  Lemma  2:  1)  no  solution,  2)  only  one  solution,  and  3)  two  optimal  solution.  Now  we 
discuss  how  to  obtain  the  optimal  noise  power  for  a  fixed  rank. 


Lemma  3. 


The  noise  power  obtained  by  the  expected  likelihood  approach,  is  given  by 


^el  =  exp 


(3.24) 
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where  Wk{z)  is  the  k-th  branch  of  Lambert  W  function  and 


a  =  r  —  N 

b  =  Etr+idk  (3.25) 

c  =  log  LRo  -  log  rifcr+i  dkj  +  a 

Proof.  For  a  given  rank  r,  the  optimal  solution  of  the  noise  power  via  the  EL  approach,  t{=  cr'^if),  is 
the  solution  of  LR(f)  =  LRq.  From  Eq.  (3.19),  that  is,  t  is  the  solution  of  the  equation  given  by 

dpe^-^p-^e-"^  =LRo  (3.26) 


Taking  log  on  both  side  leads 


logdp  +  At  -  r  +  (r  -  At)logt - ^  =  log  LRq 


(3.27) 


For  simplification,  we  take  substitutions  of  variables, 


a  =  r  —  N 

b  =  Ek=r+1  dk 

c  =  log  LRo  -  log  n^r+i  dk 


(3.28) 


Then,  Eq.  (3.27)  is  simplified  to  an  equation  of  t. 


a  log  t  —  -  =  c 


(3.29) 


Again,  let  u  =  logt.  Then,  since  t  =  e“,  we  obtain 


,  -  6e-“  = 


®  ~b'^~b 


(3.30) 

(3.31) 


Now  let  s  =  u  —  Then,  the  equation  is 


€■  ^  —  —  s 

b 


se  =  —e  “ 
a 


(3.32) 

(3.33) 


The  solution  of  Eq.  (3.33)  is  known  to  be  obtained  using  Lambert  W  function  [83].  That  is. 


s  =  W\  -e  “ 
a 


(3.34) 
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Figure  3.3:  The  LR  value  versus  tr^  for  the  simulation  model,  N  =  20,  K  =  40,  r  =  5 
where  W{-)  is  a  Lambert  W  function  which  is  defined  to  be  the  function  satisfying 

=  z  (3.35) 


Finally,  we  obtain 


and 


u  =  W 


c 


^EL  =  i  =  exp 


(3.36) 


(3.37) 

□ 


Lemma  3  shows  that  there  is  a  closed  form  solution  of  the  optimal  noise  power  for  a  fixed  rank. 
Therefore  we  do  not  need  any  iterative  and  numerical  algorithms  to  obtain  both  the  optimal  rank  and 
noise  power. 

Now  we  propose  the  method  to  alternately  find  the  optimal  solution  of  both  the  rank  and  the 
noise  power.  For  a  fixed  tr^,  we  can  obtain  the  optimal  rank  via  Algorithm  1.  For  a  hxed  rank,  we 
should  consider  three  cases  described  above.  The  first  case  implies  that  the  LR  value  corresponding 
(T^l  is  less  than  LRq  and  therefore,  we  increase  the  rank  until  the  solution  of  cr^  exists.  In  the  second 
case,  we  can  easily  determine  =  ct^l.  For  the  third  case  that  there  are  two  solutions  of  a^,  we 
have  to  choose  one  among  two  EL  solutions  and  the  ML  solution.  We  experimentally  observe  that  the 
threshold  in  the  test  statistics  such  as  the  normalized  matched  filter  is  typically  smaller  for  the  better 
estimator  in  the  sense  of  the  normalized  SINR  and  the  probability  of  detection  from  our  experiments. 
Therefore,  we  choose  one  of  which  generates  the  smallest  value  of  the  test  statistics. 

The  detail  procedure  of  the  algorithm  is  described  in  Algorithm  2. 
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Algorithm  2  The  proposed  algorithm  to  select  the  rank  and  the  noise  level  via  EL 
1;  Initialize  the  rank  r  by  physical  environment  such  as  Brennan  rule  or  the  number  of  jammers. 

2:  If  there  is  no  solution  of  for  given  r,  increase  r  until  the  solution  of  exists. 

3:  Obtain  ct^l  =  ivb  J2Zr+i  di- 

4:  For  given  CTmlj  ^  ^  using  Algorithm  1. 

5:  Repeat  Step  3  and  Step  4  until  the  rank  r  converges. 

6;  After  r  is  determined,  choose  <7^  among  cf‘^i^2- 

3.2.3  Imperfect  condition  number  constraint 

Now  we  discuss  the  proposed  method  to  determine  the  condition  number  constraint  through  the 
EL  approach  in  this  section.  The  condition  number  constrained  ML  estimator  is  a  function  of  the 
condition  number  Amax-  Therefore,  the  hnal  estimate  is  also  a  function  of  Amax-  Similar  to  what 
we  have  done  in  previous  sections,  we  find  an  optimal  condition  number  so  that  the  LR  value  of  the 
estimated  covariance  matrix  should  be  same  as  a  statistical  median  value  of  the  LR  value  of  the  true 
covariance  matrix,  that  is 

Rcncmlbl  =<7"VA*-'(A„,ax)V^  (3.38) 

where 

2 

-^max  =  S-rg  mill  LR,  (R-CNCML(I^max)5  LRq  (3.39) 

•tNmax J- 

Before  we  discuss  the  algorithm  to  hnd  the  optimal  condition  number,  we  analyze  the  closed  form 
solution  for  the  condition  number  constrained  ML  estimation  which  is  proposed  in  [25].  We  derive  a 
more  explicit  closed  form  solution. 

Lemma  4.  The  more  simplified  closed  form  solution  of  the  condition  number  constrained  ML  esti¬ 
mator  is  given  by 

1.  di  <  cr^, 

Ron  =  (3.40) 

2.  <  di  <  Cr^Amax, 

Rcw  =  Rfml  (3-41) 

3.  d\  ^  (7  Ajaax  and  A^ax  ^  jy  ? 

Rcn  =  $  diag(A*)$^  (3.42) 

where 

A  =  [^(J  Ajj^axj  .  ■  .  ,  CT  Krnaxi  :  djg  j  (J  ,  .  .  .  ,  (J  j  ,  (3.43) 

c  and  N  are  the  vector  of  the  eigenvalues  of  the  estimate,  the  largest  indices  so  that  d^  >  cr'^K-max, 
and  djv  > 
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4-  di  ^  (J  -f^max  (1‘fld  -/^max  ^ 


ELi  - 


^  rl 

A  -  L^,---,  ^  ,dp+l,...,d,, 


uK„ 


And  the  condition  numbers  of  the  estimates  are  1,  K^ax;  and  Kmax,  respectively. 
Proof.  We  consider  5  cases  provided  in  [25]. 


ii  ^  a  ^ 

Since  u*  = 


Therefore, 


X*  =  min(niin(KmaxU*,l),max(u*,  3^)) 

di 

=  min(niin(l,  1),  max(— - ,  ^)) 

^max  di 

=  min(l,i)  =  l 
di 


RcN  = 


and  the  condition  number  is  1. 


2.  <  di  <  Km; 

Since  u*  =  f-, 

di  ’ 


A*  = 


min(min(KmaxU*,l),max(u*,  ^)) 

di 

.  /  .  /  Kmax  \  /I  1  \  \ 

inm(min(— l),inax(3-,  3-)) 
ui  di  di 

/  1  , 

inm(l,  -j) 
di 

J~  ^  1 

1  J,-  <  1 


Therefore, 


and  the  condition  number  is 


Rcn  =  R-fml 


3.  di  >  0-2  Kmax  and  u*  = 

\u=d-  must  be  zero  if  u*  =  The  first  derivative  of  Gi{u)  is  given  by 


0  if  —  <  t6  <  1 


DISTRIBUTION  A:  Approved  for  public  release 


(3.44) 


(3.45) 

(3.46) 

(3.47) 

(3.48) 


(3.49) 

(3.50) 

(3.51) 

(3.52) 

(3.53) 


(3.54) 
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for  di  <  1,  and 


Ki];iax  di  if  0  <  "u  ^ 


G^{u)  = 


0 

~h  di 


Kmax  di 


if  T,  ^  T  <  u  <  j- 

if  i  <  u  <  1 


(3.55) 


for  di  >  1.  Therefore, 


dG{u) 

du 


N 


N 


i—J-  —  ^  ^  (Kmax  di  di)  T  ^  ^  (Kmax  dj  di) 


(3.56) 


i=N+l 


where  p  is  the  greatest  index  such  that  ~  ^  t  .  For  i  =  N, . . .  ,N,  since  di  <  1, 


Kmax  di  di  <  Kmax  di  <  0 


(3.57) 


and  for  i  =  p, N  —  1,  since  di  >  K^ax  di,  K^ax  di  —  di  <  0. 
obvious  that 


dG{u) 

du 


<  0 


Therefore,  in  this  case,  it  is 
(3.58) 


4.  di  >  cr^  Kmax  and  u*  = 

Aubry  et  al.  [25]  showed  that  u*  =  if  L-  i  <  0.  From  Eq.  (3.54)  and  Eq.  (3.55), 

max  Kmax 


dG{u) 

du 


N 

2  =  iV+l 


P 

1)  +  ~  Kmax) 

z=l 


(3.59) 


where  p  is  the  greatest  index  such  that  dp  >  Kmax-  Therefore, 


^L=^<0  (3.60) 


N  p 


<(4> 

Kniax(di  -  1)  +  y^(d»  -  Kmax)  <  0 

(3.61) 

i=fV+l  i=l 

<(4> 

Kmax(E,yiV+l('^*  -  1)  -  P)  +  ELl  d*  <  0 

(3.62) 

<(4> 

Kmax(Etrv+i(d‘*  -1)-P)<-  ELl  d" 

(3.63) 

<(4> 

max  ^  ,v+l(dd-l) 

(3.64) 
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In  this  case, 


min(min(KmaxU*,l),max(u*,  ^)) 

di 

min(min(l,  1),  max(— - ,  ^)) 

K-niax  di 

min(l,max(-^,  i)) 


nin(l,  j^)  d,  >  K„ 

^ '^max 

min(l,i)  di  <  K„ 

di  ^  Kjjiax 
^  1  ^  di  <  Kmax 

1  d*  <  1 


Finally  we  obtain 


A  —  (J  dCjaiax  j  ■  •  ■  5  -^max :  dpj^.  .  .  .  ^  d]\f  ^  CT  ,...,(7  , 


where  p  and  N  are  the  largest  indices  so  that  dp  >  and  dff  >  respectively. 

5.  di  >  CT  dTjaax  and  Kmax  ^  ^  J. 

In  this  case,  since  i  <  u*  <  ^  ^  , 


A*  =  min(niin(KmaxU*,  l),max(u*,  ^)) 

di 


=  min(KmaxU*,max(M*,  ^)) 

di 


min(KmaxM*,U*)  di  >  ;^ 

min(KmaxM*,  Jt)  di  <  ^ 

U* 

x  K — ^ — r  <  di  <  ^ 

di  Kmax  ^  ^ 

7/*  fj  ■  <;^  _ 3^ _ 

-LA-max  ^  ^  x^'  ...* 


(3.73) 


Therefore,  we  obtain 


A  —  [  ,  (ip+i, . . . ,  (iq, 

u* 


^  ^  77* /t" 

a. max  77 


where  p  and  g  are  the  largest  indices  so  that  dp  >  —  and  dq  > 


respectively. 


From  Lemma  4,  for  the  first  two  cases  that  is  di  <  cr^K^a^,  the  estimator  is  either  a  scaled  identity 
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matrix  or  the  FML.  Therefore,  there  is  no  need  to  find  an  optimal  condition  number  in  these  cases 
since  the  estimator  is  not  a  function  of  the  condition  number. 

Now  we  investigate  uniqueness  of  the  optimal  condition  number  as  we  have  done  in  the  case  of 
only  rank  constraint  for  the  last  two  cases  where  the  optimal  eigenvalues  are  functions  of  the  condition 
number. 

Lemma  5.  The  LR  value  of  the  condition  number  ML  estimator  is  a  monotonically  increasing  function 
with  respect  to  the  condition  number  Xmax  and  there  is  only  one  unique  • 

Proof.  1.  di  <  cr^ 

Rcn  =  (3.76) 

In  this  case,  R-cn  does  not  change,  so  LR(K„iax)  is  a  constant. 

2.  <  di  <  Cr^RTmax 

Rcn  =  Rfml  (3.77) 

In  this  case,  Rcn  does  not  change,  so  LR(Ki„ax)  is  a  constant. 

3.  di  >  fJ  iFaiax  and  Kma-X  ^ 

Rcn  =  ^diag(A*)$^  (3.78) 


where 


A  —  (J  dfjaiax  j  ■  •  ■  5  TFjaax :  dpj^  l,...,d^,(7  ,...,(7  , 


(3.79) 


p  and  N  are  the  largest  indices  so  that  dp  >  cr'^K^a,^  and  djg  >  respectively. 


LR(Ki„ax) 


nti 

exp(Ef=i 

p  ,  N  N  ^ 

Cti  TT  .  TT 


n 


K, 


n  1  n 


-  1  “  ^'.max  ■  I  1  ^ 

j=l  t=p+l  i-N+l 


1  .  pN 

2  ^ 


exp(X] 


N 


N 


E  1+  E  S) 


•  1  -  --max  .  ,  -  -  ^ 

z=l  z=p+l  z=iV+l 


nL 


P  di 


nr= 


^  _  di 


i=N+l 


•  e 


p  j  _  ^  n 

■  1  o  rvaiax  o 

^=^  i=N+l 
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(a)  within  the  range  where  p  remains  same 


LR(Kmax) 


UL 


1  =  1  <t2  K„ 


lli=Ar+l  cr2 


exp(X] 


c^-p 


TP 


=  Cl 


=  Cl 


nr=w 


-p(i:Lr 


(<t2  K„„)p 


p 


=  Ci- 


n=l 


((t2  K„ 


1  TTP 


(exp(Er=iC«0)^ 


=  C2 


i  1  )P 


N 


i=iV+i 


=  C2- 


(Kniax)^  ‘  C3 


nf 


.iV  +  l 


,  C2  =  Cl  ,  and  C3  =  exp(^  X;r=i  c^*)- 


where  Cl  =  -  „ 

exp(Ar-p)-exp(^._^^^ 

Now  let’s  evaluate  the  first  derivative  of  the  denominator  of  Eq.  (3.80). 


((K„,axF  •  C3^"“  )' 

=  p(K„,ax)'’-^C3^  +  (Kmax)'’^^7^E°^ 

V^max  ) 

=  p(K„,ax)^”^cJ^  -  (K„,ax)^”^C3^  log  C3 

=  (Kmax)^“^C;^'"“  {p  Kmax  “  log  C3) 

1  1  P 

=  (K„iax)^”^C3"'”“  (pK„,ax 

i^l 

Since  di  >  d2  >■■■>  dp  >  a'^  Kmax, 

1  ^ 

pKinax-^  <  0  (3.80) 

i—1 

This  implies  the  denominator  of  Eq.  (3.80)  is  a  decreasing  function,  and  therefore,  Li?(Kmax) 
is  a  increasing  function  with  respect  to  K^ax- 

(b)  p  — )■  p  +  1  as  Kmax  decreases 

The  LR(Kniax)  is  a  continuous  function  since  Ap+i  =  dp+i  at  the  moment  that  cr^  Kmax  = 
dp+i  and  there  is  no  discontinuity  of  A^.  Therefore,  LR(Kiaax)  is  an  increasing  function  in 
this  case. 
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4. 


>  (T  -f^max  find  Kyl 


< 


E- 


di 

(d.-l) 


Rcn  =  $diag(A*)$'^ 


(3.81) 


where 


A*  = 


■p+l)  ■  ■  ■  :  Uqt, 


(3.82) 


-  2 
p,  g,  and  N  are  the  vector  of  the  eigenvalues  of  the  estimate,  the  largest  indices  so  that  dp  >  ^, 

2  Yy 

dn  >  TTw — j  and  >  o  .  respectively. 

Before  we  prove  the  increasing  property  of  LR(Kiiiax))  we  show  u  decreases  as  K^ax  increases. 
u  is  the  optimal  solution  of  the  optimization  problem.  In  this  case,  u*  is  obtained  by  making 
the  first  derivative  of  the  cost  function  0.  Let  ui  and  U2  be  the  optimal  solutions  for  Kmaxi  and 
Kmax2>  respectively.  Then,  y'Jl,  G'Jui)  =  0  for  Kmaxi-  Since  ^  <  Ui  <  —  in  this  case, 

for  Kniax2  <  Kmaxi,  the  value  of  G[{ui)  decreases  for  di  <  1.  G[{u)  also  decreases  for  di  >  1 
and  u  <  ^  .  and  remain  same  for  di  >  1  and  ^  ■  <  u.  Therefore,  70,^1  G'Aui)  <  0  for 
Kmax2-  Finally,  since  'Yl!i=\G'i(u2)  must  be  zero  for  Kmax2j  it  is  obvious  that  ui  <  U2-  This 
shows  that  u  decreases  as  K^ax  increases. 


Now  we  show  the  increasing  property  of  LR(Ki„ax)- 


(a)  within  the  range  where  p  and  q  remain  same 

In  this  case.  We  show  LR('u)  is  a  decreasing  function  of  u  and  an  increasing  function  of 
Kmax  for  each  of  u  and  Kmax- 

i.  Proof  of  LR(t6)  is  a  decreasing  function. 


LR(r 


np  udj  TT^ 

i=l  cr2  ■  IL  = 


K„ 


-  udi 


r.N 


^  +  1 


exp(ELi^  +  ELp+ii 


+  Ef=, 


c  ud, 


^+1 


'UL 


p  A  .  yN-q  T-riV 


N  K^ax 


oiV 


exp(u(ELi^+Et,+i^ 


+q-p) 

exp(c2'u  +  C3) 

U^-q+P 


where  ci 


9+1 


olV 


„  _  Y^P  d±  I 

O2  2-^i—l  '  2^i—< 


9+1 


C3  =  q-p, 
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C4  =  and  C5  =  The  first  derivative  of  Eq.  (3.83)  is  obtained  by 


LR'(u) 

=  {N-q+  p)u^-«+P-ic5  “ 

-U^-«+Pl0gC5-C^“ 


_  .,N-q+p-l  - 


=  U 


-q+p-u\og  C5) 


=  ^c^'^{N  —  q  +  p  —  C2u) 


=  U 


C-^(N-q+p 

\  '  -^max 


-(E3+  E 


)) 


i—q-\-l 


Since  ^  <  dp, 


N-q  +  p-uiJ2;^+  E 


\  '  ^max  ^ 


i—1  i—q-\-l 

V  N  —  Q 

<  N-q  +  p-u{--\ - Kmax) 

u  u 

=  N-q-K^,^iN-q) 


Since  Kmax  >  1,  LR^(m)  <  0  which  implies  LR(m)  is  a  decreasing  function  with  respect 
to  u. 

ii.  Proof  of  LR(Kniax)  is  an  increasing  function. 


LR(Kmax) 

nP  udj  TT^ 

_  i  =  l  0-2  'lli^g+l 

exp(ELi^  +  EL,+ii 


Kmax  udj  N 

^2  fc- 


N 

2^i—< 


Krnax  Udi 


i—q-\-l 


) 


Cl  Kn 


N-q 


exp(c2  K  max  +C3) 


K, 


N-q 


=  C4- 


whereci  =  nLi’^-nf=, 


g+l  0-2 


„  —  ydd,  r  —  S^P  yd±4.n—r,  r,  —  °1 

^2  2^i=q+l  0-2  5  ‘"3  2-ii=l  0-2  +9  Pj  C4  gC3  , 
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and  C5  =  .  The  first  derivative  is 


LR'(K^ax) 


-  K,„ax^”«logC5-C^^““ 

(3.83) 

—  K  N-q-l 

—  A^max 

X  {N  -q-  Kinax  log  C5) 

(3.84) 

_  TV  N-q+p-1 

—  A^max 

^  ^5  (-^  ^  ^2  K^iax) 

(3.85) 

_  IC  N-q-\-p-l 

—  A^max 

^  „r] 

Xcr(iV-g-K„ax  5] 

i=q+l 

(3.86) 

f,  S 

N  —  Q 

>  N  q  Ki„ax(  _  )  =  0 

(3.87) 

Rmax 


Therefore,  LR'(Ki„ax)  >  0  and  LR(Ki„ax)  is  an  increasing  function  with  respect  to 
K 

^xinax- 

These  two  proofs  show  that  LR(w,Kmax)  is  an  increasing  function  with  respect  to  Kmax- 
(b)  p  and  q  changes  as  Kmax  decreases 

The  LR(u,  Kmax)  is  a  continuous  function,  and  therefore,  LR(t6,  Kmax)  is  an  increasing 
function  in  this  case. 

□ 


Lemma  5  formally  proves  that  the  there  exist  only  one  optimal  condition  number  and  therefore  we 
can  find  the  optimal  condition  number  numerically.  The  algorithm  of  finding  the  optimal  condition 
number  is  shown  in  Algorithm  3.  We  first  set  the  initial  condition  number  as  the  ML  condition  number 
obtained  by  [25].  Then  we  increase  or  decrease  the  condition  number  to  the  direction  where  the  LR 
value  decreases.  Reducing  the  stepsize  as  the  direction  is  reversed,  we  find  the  optimal  condition 
number  as  precisely  as  we  want. 
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Algorithm  3  The  proposed  algorithm  to  select  condition  number  via  EL 
1;  Obtain  the  ML  solution  of  the  condition  number  AmaxML  t>y  the  method  in  [25]  and  set  the  initial 
value  of  Ajaiax  AjaaXML 
2:  Set  the  initial  step,  A  =  Amax/lOO 
3:  Evaluate  LR(Ai„ax  -  A),  LR(Ai„ax),  LR(Amax  +  A) 

•  if  I  LR(AniaxML  +  A)  —  LRq  I  <  I  LR(Ai„aXML)  ~  LRq  I 
— >  increase  Amax  by  A  until  it  does  not  hold. 

— >  then  A  =  —A/10 

•  elseif  I  LR(AmaxML  +  A)  -  LRq  |  >  |  LR(AmaxML)  -  LRo  | 

— >  decrease  A^ax  by  A  until  it  does  not  hold. 

— >  then  A  =  —A/10 

4;  Repeat  Step  3  until  A  <  0.0001. 

3.3  Experimental  Validation 

3.3.1  Experimental  setup 

In  this  section,  we  compare  the  proposed  methods  with  alternative  covariance  estimation  algorithms 
and  parameter  estimation  algorithms.  Two  data  sets  are  used  in  the  experiments:  1)  a  radar  covariance 
simulation  model  and  2)  the  KASSPER  dataset  [8]. 

First,  we  consider  a  radar  system  with  an  A-element  uniform  linear  array  for  the  simulation  model. 
The  overall  covariance  which  is  composed  of  jammer  and  additive  white  noise  can  be  modeled  by 

,7 

R(n,  m)  =  ^  ct/  sinc[O.5,0i(n  —  +  a"^5{n,  m)  (3.88) 

where  n,m  £  {1, . . . ,  N},  J  is  the  number  of  jammers,  af  is  the  power  associated  with  the  tth  jammer, 
(pi  is  the  jammer  phase  angle  with  respect  to  the  antenna  phase  center,  Pi  is  the  fractional  bandwidth, 
cr^  is  the  actual  power  level  of  the  white  disturbance  term,  and  6(n,Tn)  has  the  value  of  1  only  when 
n  =  m  and  0  otherwise.  This  simulation  model  has  been  widely  and  very  successfully  used  in  previous 
literature  [24,  25,  74,  33]  for  performance  analysis. 

Data  from  the  L-band  data  set  of  KASSPER  program  is  the  other  data  set  used  in  our  experiments. 
Note  that  the  KASSPER  data  set  exhibits  two  desirable  characteristics:  1)  the  low-rank  structure  of 
clutter  and  2)  the  true  covariance  matrices  for  each  range  bin  have  been  made  available.  These  two 
characteristics  facilitate  comparisons  via  powerful  hgures  of  merit.  The  L-band  data  set  consists  of  a 
data  cube  of  1000  range  bins  corresponding  to  the  returns  from  a  single  coherent  processing  interval 
from  11  channels  and  32  pulses.  Therefore,  the  dimension  of  observations  (or  the  spatio-temporal 
product)  A  is  11  X  32  =  352.  Other  key  parameters  are  detailed  in  Table  2.1. 

We  measure  the  normalized  signal  to  interference  and  noise  ratio  (SINR).  The  normalized  SINR 
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measure  is  commonly  used  in  the  radar  literature  and  given  by 


|s^R 

|s^R-iRR-is||s«R-is| 


(3.89) 


where  s  is  the  spatio-temporal  steering  vector,  R  is  the  data-dependent  estimate  of  R,  and  R  is  the 
true  covariance  matrix.  It  is  easily  seen  that  0  <  rj  <  1  and  ry  =  1  if  and  only  if  R  =  R.  The  SINK  is 
plotted  in  decibels  in  all  our  experiments,  that  is,  SINR(dB)  =  lOlog^gry.  Therefore,  SINR(dB)  <  0. 
For  the  KASSPER  data  set,  since  the  steering  vector  is  a  function  of  both  azimuthal  angle  and  Doppler 
frequency,  we  obtain  plots  as  a  function  of  one  variable  (azimuthal  angle  or  Doppler)  by  marginalizing 
over  the  other  variable.  We  evaluate  and  compare  different  covariance  estimation  techniques  and 
parameter  selection  methods  in  the  following  three  subsections: 

•  Sample  Covariance  Matrix:  The  sample  covariance  matrix  is  given  by  S  =  j^ZZ^ .  It  is 
well  known  that  S  is  the  unconstrained  ML  estimator  under  Gaussian  disturbance  statistics.  We 
refer  to  this  as  SMI. 


•  Fast  Maximum  Likelihood:  The  fast  maximum  likelihood  (FML)  [24]  uses  the  structural 
constraint  of  the  covariance  matrix.  The  FML  method  just  involves  the  eigenvalue  decomposition 
of  the  sample  covariance  and  perturbing  eigenvalues  to  conform  to  the  structure.  The  FML  also 
can  be  considered  as  the  RCML  estimator  with  the  rank  which  is  the  greatest  index  i  satisfying 
Xi  >  cr^  where  Ai’s  are  the  eigenvalues  of  the  sample  covariance  in  descending  order.  Therefore, 
a  rank  can  be  considered  as  an  output  of  the  FML.  The  FML’s  success  in  radar  STAP  is  widely 
known  [11]. 

•  Rank  Constrained  ML  Estimators:  The  RCML  estimator  with  the  rank  or  the  rank  and 
the  noise  level  obtained  by  the  proposed  methods  using  the  expected  likelihood  approach.  The 
rank  is  obtained  by  the  EL  approach  in  the  case  of  the  imperfect  rank  constraint  and  both  of 
the  rank  and  the  noise  level  are  obtained  by  the  EL  approach  in  the  case  of  imperfect  rank  and 
noise  power  constraints.  We  refer  to  these  as  RCMLel- 

•  Chen  et  al.  Rank  Selection  Method:  Chen  et  al.  [84]  proposed  a  statistical  procedure  for 
detecting  the  multiplicity  of  the  smallest  eigenvalue  of  the  structured  covariance  matrix  using 
statistical  selection  theory.  The  rank  can  be  estimated  from  their  methods  using  pre-calculated 
parameters.  We  refer  to  this  method  as  RCMLchen  • 

•  AIC:  Akaike  [77]  proposed  the  information  theoretic  criteria  for  model  selection.  The  AIC 
selects  the  model  that  best  fits  the  data  for  given  a  set  of  observations  and  a  family  of  models, 
that  is,  a  parameterized  family  of  probability  densities.  Wax  and  Kailath  [79]  proposed  the 
method  to  determine  the  number  of  signals  from  the  observed  data  based  on  the  AIC.  We 
compare  Wax  and  Kailath’s  method. 


DISTRIBUTION  A:  Approved  for  public  release 


62 


average  SINR  (dB) 


20  30  40 

K 


Figure  3.4:  Normalized  SINR  in  dB  versus  number  of  training  samples  K  (TV  =  20)  for  the  simulation 
model. 

•  Condition  number  constrained  ML  estimators:  The  maximum  likelihood  estimation 
method  of  the  covariance  matrix  with  a  condition  number  [25]  proposed  by  Aubry  et  al.  is 
considered  for  evaluating  the  performance  with  three  different  condition  numbers.  1)  CNCML 
:  the  condition  number  obtained  by  the  proposed  method  in  [25],  2)  CNCMLel  :  the  condition 
number  obtained  by  the  expected  likelihood  approach,  and  3)  CNCMLtrue  :  the  true  condition 
number. 

3.3.2  Imperfect  rank  constraint 

First,  we  compare  the  rank  estimation  method  proposed  in  Section  3.2.1  with  alternative  algorithms 
including  SMI,  FML,  AIC,  and  Chen’s  algorithm.  We  plot  the  normalized  SINR  (in  dB)  versus  the 
number  of  training  samples,  20,  30,  and  40  in  Fig.  3.4  for  the  simulation  model.  The  SINR  values  are 
obtained  by  averaging  SINR  values  from  500  Monte  Carlo  trials.  It  is  shown  that  the  SINR  values 
increases  monotonically  as  K  increases.  The  RCMLel exhibits  the  best  performance  in  all  training 
regimes.  Particularly,  the  difference  between  RCMLEuand  other  methods  increases  when  training 
samples  are  limited. 

Figure  3.5  shows  the  normalized  SINR  values  for  various  number  of  training  samples  for  the 
KASSPER  data  set.  We  plot  the  averaged  SINR  values  in  decibel  over  either  azimuth  angle  or  Doppler 
frequency  domain.  The  left  and  right  column  show  the  results  for  angle  and  Doppler,  respectively. 
Similarly  to  the  results  for  the  simulation  model,  RCMLELOutperforms  all  the  other  compared  methods 
in  all  training  regime.  This  implies  that  the  rank  obtained  by  the  EL  approach  is  more  accurate  and 
closer  to  the  rank  predicted  by  Brennan  rule  (M  +  P  —  1  =  42)  than  any  other  methods. 

Realistic  case  of  contaminated  observations:  In  practice,  homogeneous  training  samples 
are  hard  to  obtain  and  the  received  signals  are  often  corrupted  by  target  information.  Therefore, 
it  is  meaningful  to  compare  the  performance  for  nonhomogeneous  observation  to  investigate  which 
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Normalized  azimuthal  angle 


Normalized  SINR  in  dB  vs.  Doppler:  352  training  samples 


-4  -3  -2  -1  0  1  2  3  4 

Normalized  Doppler  frequency 


(a.)  K  =  N  =  352 


{h)  K  =  N  =  352 


Normalized  SINR  in  dB  vs.  angle:  528  training  samples 


(c)  K  =  1.5N  =  528 


Normalized  SINR  in  dB  vs.  Doppler:  528  training  samples 


(d)  K  =  1.5N  =  528 


Normalized  SINR  in  dB  vs.  angle:  704  training  samples 


(e)  K  =  2N  =  704 


Normalized  SINR  in  dB  vs.  Doppler:  704  training  samples 


(f)  K  =  2N  =  704 


Figure  3.5:  Normalized  SINR  versus  azimuthal  angle  and  Doppler  frequency  for  the  KASSPER  data 
set. 


algorithm  indeed  works  well  and  is  robust  in  practice.  In  this  case,  the  received  signal  is  given  by 


z  =  as  +  d 


(3.90) 
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Normalized  SINR  in  dB  vs.  angle:  352  training  samples  Normalized  SINR  in  dB  vs.  Doppler:  352  training  samples 


(a.)  K  =  N  =  352 


{h)  K  =  N  =  352 


Normalized  SINR  in  dB  vs.  angle:  528  training  samples  Normalized  SINR  in  dB  vs.  Doppler:  528  training  samples 


Normalized  SINR  in  dB  vs.  angle:  704  training  samples 


(e)  K  =  2N  =  704 


Normalized  SINR  in  dB  vs.  Doppler:  704  training  samples 


(f)  K  =  2N  =  704 


Figure  3.6:  Normalized  SINR  versus  azimuthal  angle  and  Doppler  frequency  for  the  KASSPER  data 
set. 


where  s  and  d  are  the  deterministic  steering  vector  and  stochastic  disturbance  vector,  respectively. 
Figure  3.6  shows  the  normalized  SINR  values  when  a  half  of  the  training  samples  contain  s  with  a  =  50. 
The  gap  between  RCMLELand  the  others  is  bigger  than  that  in  Figure  3.5.  In  particular,  AIC  shows 
compatible  performance  in  homogeneous  cases  though.  Figure  3.6e  and  Figure  3.6f  show  that  the 
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average  SINR  (dB) 

0| - ^ ^ - 


Figure  3.7:  Normalized  SINR  in  dB  versus  number  of  training  samples  K  (TV  =  20)  for  the  simulation 
model. 

difference  between  AIC  and  RCMLELis  larger  for  non-homogeneous  case.  Therefore  the  results  show 
remarkably  that  RCMLELStill  excels  under  target  contamination  or  heterogeneous  training  where 
other  techniques  face  severe  degradation  in  performance. 

3.3.3  Imperfect  rank  and  noise  power  constraints 

In  this  section,  we  show  experimental  results  for  estimation  of  both  a  rank  and  a  noise  power  via 
the  expected  likelihood  approach,  which  is  proposed  in  Section  3.2.2.  In  this  case,  we  assume  that 
both  the  rank  and  the  noise  power  are  unknown  and  to  be  estimated  for  both  the  simulation  model 
and  the  KASSPER  data  set.  Since  the  previous  works  such  as  AIC  and  Chen’s  algorithm  are  for 
only  estimating  the  rank  and  can  not  be  extended  to  estimate  both  the  rank  and  the  noise  power,  we 
compare  the  proposed  EL  method  with  the  sample  covariance,  FML,  and  the  RCML  estimator  with 
a  prior  knowledge  of  the  rank.  For  the  RCML  estimator,  we  employ  the  number  of  jammers  (r  =  5) 
and  the  Brennan  rule  (r  =  42)  as  the  clutter  rank  for  the  simulation  model  and  the  KASSPER  data 
set,  respectively.  In  addition,  since  the  FML  method  requires  a  prior  knowledge  of  the  noise  power, 
we  calculate  and  use  the  maximum  likelihood  estimate  of  the  noise  power  for  a  rank  given  by  a  prior 
knowledge  for  the  FML. 

Figure  3.7  shows  the  performance  of  various  estimators  in  the  sense  of  the  normalized  SINR  values 
for  the  simulation  model.  Similarly  to  the  case  of  only  rank  estimation,  the  proposed  method  show 
the  best  performance  in  all  training  regimes. 

Figure  3.8  shows  the  performance  of  the  methods  in  the  normalized  SINR  for  the  KASSPER  data 
set.  The  proposed  method  is  comparable  with  or  slightly  better  than  the  RCML  estimator  using  the 
rank  by  Brennan  rule.  This  means  that  the  proposed  method  estimates  both  the  rank  and  the  noise 
power  adaptively  from  training  samples  whereas  the  rank  by  Brennan  rule  is  fixed  regardless  of  the 
training  samples. 
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Normalized  SINR  in  dB  vs.  angle:  352  training  samples 


(a.)  K  =  N  =  352 


Normalized  SINR  in  dB  vs.  Doppler:  352  training  samples 


Normalized  Doppler  frequency 


(b)  K  =  N  =  352 


Normalized  SINR  in  dB  vs.  angle:  528  training  samples 


Normalized  SINR  in  dB  vs.  Doppler:  528  training  samples 


Normalized  Doppler  frequency 


(c)  K  =  1.5N  =  528 


(d)  K  =  1.5N  =  528 


(e)  K  =  2N  =  704 


(f)  K  =  2N  =  704 


Figure  3.8:  Normalized  SINR  versus  azimuthal  angle  and  Doppler  frequency  for  the  KASSPER  data 
set. 

3.3.4  Imperfect  condition  number  constraint 

Now  we  show  experimental  results  for  the  condition  number  estimation  method  proposed  in  Section 
3.2.3.  We  compare  the  proposed  method,  denoted  by  CNCMLel,  with  four  different  covariance 
estimation  methods,  the  sample  covariance  matrix  (SMI),  FML,  CNCML  ,  and  CNCMLtrue  . 

Tables  3.1  and  3.2  show  the  normalized  SINR  values  for  the  simulation  model.  We  analyze  five 
different  scenarios  with  different  parameters  of  the  simulated  covariance  model  given  by  Eq.  (3.88). 
We  use  the  same  parameters  as  those  used  in  [25]  to  evaluate  the  performances  and  they  are  shown 
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K 

SMI 

FML 

CNCML 

CNCMLel 

CNCMLtrue 

20 

-9.3785 

-0.5195 

-0.5212 

-0.4822 

-0.5200 

-5 

30 

-4.2579 

-0.4242 

-0.4257 

-0.4256 

-0.4250 

40 

-2.7424 

-0.3460 

-0.3476 

-0.3476 

-0.3468 

20 

-9.3196 

-0.5511 

-0.5521 

-0.5141 

-0.5515 

0 

30 

-4.2276 

-0.4202 

-0.4221 

-0.4220 

-0.4210 

40 

-2.7649 

-0.3513 

-0.3530 

-0.3528 

-0.3521 

20 

-9.0922 

-0.5269 

-0.5279 

-0.4875 

-0.5272 

5 

30 

-4.2172 

-0.4348 

-0.4364 

-0.4362 

-0.4357 

40 

-2.7300 

-0.3484 

-0.3503 

-0.3505 

-0.3493 

20 

-9.3511 

-0.5355 

-0.5305 

-0.4998 

-0.5360 

10 

30 

-4.1955 

-0.4164 

-0.4180 

-0.4175 

-0.4175 

40 

-2.7491 

-0.3501 

-0.3515 

-0.3518 

-0.3509 

(a) 


K 

SMI 

FML 

CNCML 

CNCMLel 

CNCMLtrue 

20 

-9.3069 

-1.7371 

-1.7322 

-1.7358 

-1.7350 

-5 

30 

-4.1795 

-1.2399 

-1.2388 

-1.2347 

-1.2397 

40 

-2.7535 

-0.9496 

-0.9492 

-0.9456 

-0.9493 

20 

-9.1354 

-1.6944 

-1.6928 

-1.7027 

-1.6940 

0 

30 

-4.2345 

-1.2986 

-1.2987 

-1.2955 

-1.2990 

40 

-2.7545 

-1.0041 

-1.0043 

-1.0023 

-1.0046 

20 

-9.2524 

-1.3976 

-1.4016 

-1.3244 

-1.4000 

5 

30 

-4.2309 

-1.0737 

-1.0784 

-1.0666 

-1.0762 

40 

-2.7523 

-0.8848 

-0.8876 

-0.8818 

-0.8866 

20 

-9.3660 

-1.2567 

-1.2569 

-1.2115 

-1.2570 

10 

30 

-4.3013 

-0.9526 

-0.9545 

-0.9450 

-0.9537 

40 

-2.7350 

-0.7171 

-0.7197 

-0.7139 

-0.7186 

(b) 


K 

SMI 

FML 

CNCML 

CNCMLel 

CNCMLtrue 

20 

-9.3702 

-0.5340 

-0.5349 

-0.4925 

-0.5340 

-5 

30 

-4.2791 

-0.4302 

-0.4316 

-0.4315 

-0.4308 

40 

-2.7856 

-0.3493 

-0.3510 

-0.3509 

-0.3501 

20 

-9.2898 

-0.5485 

-0.5501 

-0.5104 

-0.5491 

0 

30 

-4.2648 

-0.4202 

-0.4219 

-0.4220 

-0.4209 

40 

-2.7274 

-0.3604 

-0.3621 

-0.3621 

-0.3611 

20 

-9.0582 

-0.5318 

-0.5328 

-0.4899 

-0.5322 

5 

30 

-4.1548 

-0.4142 

-0.4155 

-0.4152 

-0.4149 

40 

-2.7655 

-0.3515 

-0.3531 

-0.3533 

-0.3521 

20 

-9.3632 

-0.5352 

-0.5363 

-0.4974 

-0.5360 

10 

30 

-4.2728 

-0.4328 

-0.4348 

-0.4349 

-0.4337 

40 

-2.7577 

-0.3538 

-0.3554 

-0.3547 

-0.3547 

(c) 


Table  3.1:  Normalized  SINK  for  various  values  of  parameters  for  the  simulation  model. 


in  Table  5.2(c). 

For  the  narrowband  scenarios  {Bf  =  0)  in  Table  5.1(a)  and  Table  5.1(c),  CNCMLel  outperforms 
the  alternatives  for  the  limited  training  regime  and  FML  is  the  best  in  other  training  regimes.  Note 
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K 

SMI 

FML 

CNCML 

CNCMLel 

CNCMLtrue 

20 

-9.0316 

-1.7161 

-1.7131 

-1.7634 

-1.7150 

-5 

30 

-4.1465 

-1.1704 

-1.1691 

-1.1659 

-1.1693 

40 

-2.7727 

-0.9390 

-0.9384 

-0.9351 

-0.9387 

20 

-9.2091 

-1.6706 

-1.6701 

-1.6674 

-1.6706 

0 

30 

-4.2004 

-1.2681 

-1.2682 

-1.2633 

-1.2682 

40 

-2.7423 

-1.0102 

-1.0117 

-1.1009 

-1.0106 

20 

-9.3538 

-1.3980 

-1.4028 

-1.3216 

-1.4004 

5 

30 

-4.2203 

-1.0869 

-1.0910 

-1.0785 

-1.0889 

40 

-2.7079 

-0.8694 

-0.8721 

-0.8666 

-0.8713 

20 

-9.221 

-1.2446 

-1.2455 

-1.1982 

-1.2452 

10 

30 

-4.2116 

-0.9428 

-0.9460 

-0.9382 

-0.9444 

40 

-2.7563 

-0.7235 

-0.7264 

-0.7226 

-0.7253 

(a) 


K 

SMI 

FML 

CNCML 

CNCMLel 

CNCMLtrue 

20 

-9.2679 

-1.1593 

-1.1616 

-1.1150 

-1.1610 

-5 

30 

-4.2234 

-0.9262 

-0.9286 

-0.9242 

-0.9278 

40 

-2.8271 

-0.7705 

-0.7729 

-0.7712 

-0.7723 

20 

-9.2934 

-0.9052 

-0.9051 

-0.8422 

-0.9059 

0 

30 

-4.1617 

-0.6909 

-0.6920 

-0.6862 

-0.6924 

40 

-2.7387 

-0.5711 

-0.5724 

-0.5676 

-0.5725 

20 

-9.4154 

-0.8398 

-0.8334 

-0.7909 

-0.8399 

5 

30 

-4.2284 

-0.6273 

-0.6231 

-0.6070 

-0.6278 

40 

-2.7208 

-0.5034 

-0.5022 

-0.4945 

-0.5046 

20 

-9.1447 

-0.7388 

-0.7225 

-0.6815 

-0.7392 

10 

30 

-4.2046 

-0.5931 

-0.5803 

-0.5535 

-0.5931 

40 

-2.7241 

-0.4821 

-0.4738 

-0.4576 

-0.4827 

(b) 


J 

^,7 

Bf 

(a) 

1 

30 

20° 

0 

(b) 

1 

30 

20° 

0.3 

(c) 

3 

[30  30  30] 

[20°  40°  60°] 

[0  0  0] 

(d) 

3 

[30  30  30] 

[20°  40°  60°] 

[0.3  0.3  0.3] 

(e) 

3 

[10  20  30] 

[20°  40°  60°] 

[0.2  0  0.3] 

(c) 


Table  3.2:  Normalized  SINK  for  various  values  of  parameters  for  the  simulation  model. 


that  the  gap  between  CNCMLel  and  FML  (at  most  0.002)  is  much  smaller  than  that  of  the  limited 
training  regime  (at  least  0.3). 

On  the  other  hand,  for  the  wideband  scenarios  in  Tables  5.1(b),  5.2(a),  and  5.2(b),  CNCMLel 
shows  the  best  performance  in  most  cases  including  CNCMLtrue  using  the  true  condition  number. 

The  experimental  results  for  the  KASSPER  data  set  are  shown  in  Figure  3.9.  We  do  not  plot 
the  sample  covariance  matrix  to  clarify  the  difference  among  the  estimators.  In  every  case,  FML, 
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Normalized  SINR  in  dB  vs.  angle:  352  training  samples 
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Normalized  SINR  in  dB  vs.  Doppler:  352  training  samples 
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Figure  3.9:  Normalized  SINR  versus  azimuthal  angle  and  Doppler  frequency  for  the  KASSPER  data 
set.  (a)  and  (b)  for  iF  =  =  352,  (c)  and  (d)  for  K  —  1.5N  =  528,  and  (e)  and  (f)  for  K  =  2N  =  704 


CNCML,  and  CNCMLtrue  are  very  close  to  one  another  and  CNCMLel  is  the  best  estimator.  Note 
that  CNCMLel  is  based  on  the  same  algorithm  as  CNCML  and  CNCMLtrue  and  differs  from  them 
in  the  point  that  CNCMLel  uses  a  different  condition  number  which  is  estimated  by  the  expected 
likelihood  approach.  Again,  this  shows  that  the  expected  likelihood  criterion  is  really  useful  and 
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powerful  to  estimate  parameters  which  is  imperfectly  known  and  leads  to  an  adaptive  and  robust 
covariance  estimator. 


3.4  Conclusion 

We  propose  robust  covariance  estimation  algorithms  which  automatically  determine  the  optimal  values 
of  practical  constraints  via  the  expected  likelihood  criterion  for  radar  STAP  in  this  chapter.  Three 
different  cases  of  practical  constraints  which  is  exploited  in  recent  works  including  the  rank  constrained 
ML  estimation  and  the  condition  number  constrained  ML  estimation  are  investigated.  Significant 
analytic  results  and  proofs  are  derived  for  each  case.  Uniqueness  of  the  optimal  values  of  the  rank 
constraint  and  the  condition  number  constraint  is  formally  proved  and  a  closed  form  solution  of  the 
noise  level  is  derived.  Experimental  results  show  that  the  estimators  with  the  constraints  obtained  by 
the  expected  likelihood  outperform  previous  works  which  uses  constraints  obtained  by  other  parameter 
estimation  methods  including  the  maximum  likelihood  solution  of  the  constraints. 
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